Many multithreaded, grid-based, dynamically adaptive solvers for partial differential equations must continually traverse subgrids (patches) of different and changing sizes. The parallel efficiency of this traversal depends on the interplay of the patch size, the architecture used, the operations triggered throughout the traversal, and the grain size, i.e. the size of the subtasks into which the patch is broken. We propose an oracle mechanism that delivers grain sizes on the fly. It takes into account historical runtime measurements for different patch and grain sizes as well as the traversal's operations, and it yields reasonable speedups. Neither magic configuration settings nor an expensive pre-tuning phase is necessary; it is an autotuning approach. © 2012 Springer-Verlag.
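The abstract does not spell out how the oracle works internally, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a hypothetical `GrainSizeOracle` class that keys its history by operation and patch size, tries each candidate grain size once, and then returns the candidate with the smallest mean measured runtime. The class name, the method names, and the halving candidate set are all illustrative assumptions.

```python
from collections import defaultdict


class GrainSizeOracle:
    """Hypothetical oracle: picks a grain size per (operation, patch size)
    from previously recorded runtimes. Illustrative only."""

    def __init__(self):
        # (operation, patch_size) -> {grain_size: [total_seconds, sample_count]}
        self._history = defaultdict(dict)

    @staticmethod
    def candidates(patch_size):
        # Assumed candidate set: start at the full patch (i.e. serial
        # traversal) and halve repeatedly down to a grain size of 1.
        grain, result = max(patch_size, 1), []
        while grain >= 1:
            result.append(grain)
            grain //= 2
        return result

    def record(self, operation, patch_size, grain_size, seconds):
        # Accumulate one runtime measurement for this configuration.
        entry = self._history[(operation, patch_size)].setdefault(
            grain_size, [0.0, 0]
        )
        entry[0] += seconds
        entry[1] += 1

    def grain_size(self, operation, patch_size):
        seen = self._history[(operation, patch_size)]
        # Explore: return the first candidate without any measurement yet.
        for grain in self.candidates(patch_size):
            if grain not in seen:
                return grain
        # Exploit: return the candidate with the smallest mean runtime.
        return min(seen, key=lambda g: seen[g][0] / seen[g][1])
```

In such a scheme the solver would ask the oracle for a grain size before each patch traversal, time the traversal, and report the measurement back through `record`, so tuning happens during the production run rather than in a separate pre-tuning phase.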
|Original language||English (US)|
|Title of host publication||Parallel Processing and Applied Mathematics|
|Number of pages||10|
|State||Published - 2012|
Bibliographical note
KAUST Repository Item: Exported on 2020-10-01
Acknowledged KAUST grant number(s): UK-c0020
Acknowledgements: This publication is partially based on work supported by Award No. UK-c0020, made by the King Abdullah University of Science and Technology (KAUST). Computing resources for the present work have also been provided by the Gauss Centre for Supercomputing under grant pr63no.
This publication acknowledges KAUST support, but has no KAUST affiliated authors.