Abstract
We implement an efficient data compression algorithm that reduces the memory footprint of spatial datasets generated during scientific simulations. Storing these datasets regularly is typically needed for checkpoint/restart or for post-processing purposes. Our lossy compression approach, codenamed HLRcompress (https://gitlab.mis.mpg.de/rok/HLRcompress), combines a hierarchical low-rank approximation technique with binary compression. This novel hybrid method is agnostic to the particular domain of application. We study the impact of HLRcompress on accuracy using synthetic datasets to demonstrate the software's capabilities, including robustness and versatility. We assess different algebraic compression methods and report performance results on various parallel architectures. We then integrate HLRcompress into the workflow of a direct numerical simulation solver for turbulent combustion on distributed-memory systems. We compress the snapshots generated during time integration using accuracy thresholds for each individual chemical species, without degrading the practical accuracy of the overall pressure and temperature. Finally, we compare against state-of-the-art compression software. Our implementation achieves, on average, more than 100-fold compression relative to the original size of the datasets.
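To make the idea of accuracy-controlled low-rank compression concrete, the following is a minimal sketch in Python with NumPy. It is not the HLRcompress API or the authors' algorithm: it assumes a flat, uniform blocking instead of a hierarchical partition, uses a truncated SVD per block, and casts the factors to single precision as a crude stand-in for the binary compression stage. The function names (`truncate_block`, `compress`, `decompress`), the test field, and the threshold value are all hypothetical choices made for illustration.

```python
import numpy as np


def truncate_block(block, eps):
    """Low-rank factors of one block via truncated SVD.

    Keeps the smallest rank whose Frobenius-norm error is at most
    eps * ||block||_F (an illustrative accuracy criterion).
    """
    u, s, vt = np.linalg.svd(block, full_matrices=False)
    # tail[k] is the Frobenius-norm error when only the first k
    # singular values are kept; tail[0] equals ||block||_F.
    energy = np.concatenate([np.cumsum((s ** 2)[::-1])[::-1], [0.0]])
    tail = np.sqrt(energy)
    rank = int(np.argmax(tail <= eps * tail[0]))
    # Casting to float32 stands in for a real binary compressor.
    left = (u[:, :rank] * s[:rank]).astype(np.float32)
    right = vt[:rank, :].astype(np.float32)
    return left, right


def compress(field, block_size, eps):
    """Compress a 2D array block by block (flat blocking, not hierarchical)."""
    blocks = {}
    rows, cols = field.shape
    for i in range(0, rows, block_size):
        for j in range(0, cols, block_size):
            tile = field[i:i + block_size, j:j + block_size]
            blocks[(i, j)] = truncate_block(tile, eps)
    return blocks


def decompress(blocks, shape):
    """Rebuild the full array from the stored low-rank factors."""
    out = np.zeros(shape)
    for (i, j), (left, right) in blocks.items():
        tile = left.astype(np.float64) @ right.astype(np.float64)
        out[i:i + tile.shape[0], j:j + tile.shape[1]] = tile
    return out


# Synthetic smooth field on a 256 x 256 grid (hypothetical test data).
x = np.linspace(0.0, 1.0, 256)
field = np.exp(-(x[:, None] - x[None, :]) ** 2)

eps = 1e-4  # relative accuracy threshold (hypothetical value)
blocks = compress(field, 64, eps)
approx = decompress(blocks, field.shape)

rel_err = np.linalg.norm(field - approx) / np.linalg.norm(field)
stored = sum(left.size + right.size for left, right in blocks.values())
print(f"relative error: {rel_err:.2e}")
print(f"stored values: {stored} of {field.size}")
```

Because each block is truncated relative to its own norm, the global relative error in the Frobenius norm stays close to `eps`, up to the additional rounding introduced by the single-precision cast. A per-field threshold of this kind is one plausible way to read "accuracy thresholds for each individual chemical species", with one `eps` chosen per species.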
Original language | English (US) |
---|---|
Title of host publication | Euro-Par 2022: Parallel Processing |
Publisher | Springer International Publishing |
Pages | 403-418 |
Number of pages | 16 |
ISBN (Print) | 9783031125966 |
DOIs | |
State | Published - Aug 1 2022 |
Bibliographical note
KAUST Repository Item: Exported on 2022-09-14
Acknowledgements: For computer time, this research used Shaheen-2 Supercomputer hosted at the Supercomputing Laboratory at KAUST.
ASJC Scopus subject areas
- Theoretical Computer Science
- General Computer Science