Abstract
Many parallel scientific applications spend a significant amount of time reading and writing data files. Collective I/O operations make it possible to optimize the file access of a process group by redistributing data across processes to match the data layout on the file system. In most parallel I/O libraries, the implementation of collective I/O operations is based on the two-phase I/O algorithm, which consists of a communication (shuffle) phase and a file access phase. This paper evaluates various design options for overlapping two internal cycles of the two-phase I/O algorithm, and explores different data transfer primitives for the shuffle phase, including non-blocking two-sided communication and multiple versions of one-sided communication. The results indicate that overlap algorithms incorporating asynchronous I/O outperform overlapping approaches that rely only on non-blocking communication. However, in the vast majority of the test cases, one-sided communication did not lead to performance improvements over two-sided communication.
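The overlap idea described in the abstract can be illustrated with a small sketch. This is a single-process simulation under stated assumptions, not the paper's implementation and not MPI code: the function names (`shuffle`, `write_block`, `two_phase_write`), the `{offset: bytes}` layout of each process's data, and the in-memory `bytearray` standing in for the file are all hypothetical. A background thread plays the role of asynchronous I/O, so that the file access of one cycle can proceed while the shuffle of the next cycle runs.

```python
from concurrent.futures import ThreadPoolExecutor


def shuffle(proc_data, start, end):
    """Communication phase (simulated): collect the bytes that belong to the
    file range [start, end) from every process into one contiguous buffer."""
    buf = bytearray(end - start)
    for pieces in proc_data:                 # one {offset: bytes} dict per process
        for off, chunk in pieces.items():
            lo, hi = max(off, start), min(off + len(chunk), end)
            if lo < hi:                      # chunk overlaps this cycle's range
                buf[lo - start:hi - start] = chunk[lo - off:hi - off]
    return buf


def write_block(file_image, start, buf):
    """File access phase (simulated): one contiguous write at offset start."""
    file_image[start:start + len(buf)] = buf


def two_phase_write(proc_data, file_size, cycle_size):
    """Run the two phases cycle by cycle, overlapping the write of cycle i
    with the shuffle of cycle i + 1."""
    file_image = bytearray(file_size)        # stand-in for the shared file
    pending = None                           # write still in flight, if any
    with ThreadPoolExecutor(max_workers=1) as pool:
        for start in range(0, file_size, cycle_size):
            end = min(start + cycle_size, file_size)
            buf = shuffle(proc_data, start, end)   # runs while 'pending' writes
            if pending is not None:
                pending.result()             # wait for the previous cycle's write
            pending = pool.submit(write_block, file_image, start, buf)
        if pending is not None:
            pending.result()                 # drain the last write
    return bytes(file_image)


# Two simulated processes holding interleaved 4-byte pieces of a 16-byte file.
procs = [{0: b"AAAA", 8: b"CCCC"}, {4: b"BBBB", 12: b"DDDD"}]
result = two_phase_write(procs, file_size=16, cycle_size=8)
```

With these inputs, `result` is the contiguous file image `b"AAAABBBBCCCCDDDD"`. In a real parallel I/O library the shuffle would use message passing (two-sided or one-sided) between processes and the write would target a parallel file system; the sketch only shows how the two cycles are interleaved.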
Original language | English (US) |
---|---|
Title of host publication | 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) |
Publisher | IEEE |
Pages | 1044-1051 |
Number of pages | 8 |
ISBN (Print) | 9781728174457 |
DOIs | |
State | Published - Jul 28 2020 |
Externally published | Yes |
Bibliographical note
KAUST Repository Item: Exported on 2022-06-30
Acknowledgements: Partial support for this work was provided by the National Science Foundation under Award No. SI2-SSI 1663887. The authors would also like to thank the KAUST Supercomputing Laboratory for providing compute time on the Ibex cluster for this project.
This publication acknowledges KAUST support, but has no KAUST affiliated authors.