Abstract
We extend the capability of space-time geostatistical modeling using algebraic approximations, demonstrating the accuracy applications expect from double precision while performing most computations in low precision and with low-rank matrix approximations. We exploit the mathematical structure of the dense covariance matrix whose inverse action and determinant are repeatedly required in Gaussian log-likelihood optimization. Geostatistics augments first-principles modeling approaches for the prediction of environmental phenomena given the availability of measurements at a large number of locations; however, traditional Cholesky-based approaches grow cubically in complexity, gating practical extension to the continental and global datasets now available. We combine the linear algebraic contributions of mixed-precision and low-rank computations within a tile-based Cholesky solver with on-demand casting of precisions and dynamic runtime support from PaRSEC to orchestrate tasks and data movement. Our adaptive approach scales on various systems and leverages the Fujitsu A64FX nodes of Fugaku to achieve up to 12X performance speedup over the highly optimized dense Cholesky implementation.
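To make concrete why the inverse action and determinant of the covariance matrix both reduce to a single Cholesky factorization, the following is a minimal sketch of the dense, double-precision baseline only. It is not the mixed-precision, low-rank, PaRSEC-driven solver described in the abstract. With $\Sigma = LL^{T}$, the log-likelihood is $-\tfrac{1}{2}\left(n\log 2\pi + 2\sum_i \log L_{ii} + \lVert L^{-1}z\rVert^2\right)$. The function names, the exponential kernel, and its parameters are assumptions chosen for illustration.

```python
import math


def cholesky(a):
    """Return lower-triangular L with a = L L^T (a must be symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = a[j][j] - sum(low[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        low[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def forward_solve(low, b):
    """Solve low * y = b for y, where low is lower triangular."""
    y = [0.0] * len(b)
    for i in range(len(b)):
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    return y


def gaussian_loglik(cov, z):
    """Zero-mean Gaussian log-likelihood of observations z under covariance cov."""
    n = len(z)
    low = cholesky(cov)                                   # O(n^3) dense factorization
    logdet = 2.0 * sum(math.log(low[i][i]) for i in range(n))  # log|cov|
    y = forward_solve(low, z)                             # y = L^{-1} z
    quad = sum(v * v for v in y)                          # z^T cov^{-1} z
    return -0.5 * (n * math.log(2.0 * math.pi) + logdet + quad)


def exp_covariance(points, variance, length_scale):
    """Hypothetical exponential kernel; an assumption, not the paper's space-time model."""
    return [
        [variance * math.exp(-math.dist(p, q) / length_scale) for q in points]
        for p in points
    ]


if __name__ == "__main__":
    locations = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
    observations = [0.3, -0.1, 0.2]
    cov = exp_covariance(locations, variance=1.0, length_scale=0.5)
    print(gaussian_loglik(cov, observations))
```

An optimizer evaluates this function once per candidate parameter set, so the cubic cost of the factorization is paid repeatedly; that repeated factorization is the step the abstract targets with mixed-precision and low-rank tiles.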
Original language | English (US) |
---|---|
Title of host publication | SC22: International Conference for High Performance Computing, Networking, Storage and Analysis |
Publisher | IEEE |
State | Published - Feb 23 2023 |
Bibliographical note
KAUST Repository Item: Exported on 2023-03-01. Acknowledgements: For computer time, this research used the resources of the Supercomputing Laboratory (KSL) Shaheen II at King Abdullah University of Science & Technology (KAUST) in Thuwal, Saudi Arabia, and the supercomputer Fugaku provided by RIKEN through the HPCI System Research Project (Project ID: hp200310 and ra010009). We effusively thank Mitsuhisa Sato and Miwako Tsuji (RIKEN) and Bilel Hadri (KSL) for their support on Fugaku and Shaheen II, respectively, as well as Ruqing Xu from The University of Tokyo for providing the BLIS FP32 SHGEMM kernel on A64FX.