Efficient implementations of the classical molecular dynamics (MD) method for Lennard-Jones particle systems are considered. Both general algorithms and techniques that are efficient on specific CPU architectures are explained. A simple spatial-decomposition-based strategy is adopted for parallelization. Using the developed code, benchmark simulations are performed on a HITACHI SR16000/J2 system consisting of 4.7 GHz IBM POWER6 processors at the National Institute for Fusion Science (NIFS) and on an SGI Altix ICE 8400EX system consisting of 2.93 GHz Intel Xeon processors at the Institute for Solid State Physics (ISSP), the University of Tokyo. The parallelization efficiency of the largest run, consisting of 4.1 billion particles with 8192 MPI processes, is about 73% relative to that of the smallest run with 128 MPI processes at NIFS, and about 66% relative to that of the smallest run with 4 MPI processes at ISSP. The factors causing the parallel overhead are investigated. It is found that fluctuations in the execution time of each process degrade the parallel efficiency. These fluctuations may be due to interference from the operating system, which is known as OS jitter.
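For readers unfamiliar with the underlying computation, the following is a minimal sketch of a Lennard-Jones force evaluation in reduced units. It is not the authors' code: the function names `lj_pair` and `compute_forces`, the cubic periodic box, the 2.5σ cutoff, and the plain O(N²) pair loop are all assumptions chosen for illustration. An optimized implementation such as the one described in the abstract would replace the double loop with a neighbor search and distribute cells across MPI processes.

```python
def lj_pair(r2, epsilon=1.0, sigma=1.0):
    """Return (energy, F/r) for a pair at squared distance r2.

    U(r) = 4*eps*((sigma/r)**12 - (sigma/r)**6); the second value is the
    force magnitude divided by r, so it can be multiplied by the
    displacement vector directly.
    """
    s2 = sigma * sigma / r2
    s6 = s2 ** 3
    s12 = s6 * s6
    energy = 4.0 * epsilon * (s12 - s6)
    f_over_r = 24.0 * epsilon * (2.0 * s12 - s6) / r2
    return energy, f_over_r


def compute_forces(positions, box, cutoff=2.5):
    """Naive O(N^2) force loop in a cubic periodic box (illustrative only)."""
    n = len(positions)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    total_energy = 0.0
    rc2 = cutoff * cutoff
    for i in range(n - 1):
        for j in range(i + 1, n):
            d = [0.0, 0.0, 0.0]
            r2 = 0.0
            for k in range(3):
                dx = positions[i][k] - positions[j][k]
                dx -= box * round(dx / box)  # minimum-image convention
                d[k] = dx
                r2 += dx * dx
            if r2 >= rc2:
                continue  # pair is beyond the cutoff
            e, f_over_r = lj_pair(r2)
            total_energy += e
            for k in range(3):
                # Newton's third law: equal and opposite contributions
                forces[i][k] += f_over_r * d[k]
                forces[j][k] -= f_over_r * d[k]
    return forces, total_energy
```

Each pair is visited once and its contribution is applied to both particles, so the total force sums to zero. The potential here is simply truncated at the cutoff without shifting, which is a simplification of this sketch.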
|Original language||English (US)|
|Number of pages||33|
|Journal||Progress of Theoretical Physics|
|State||Published - Aug 1 2011|
Bibliographical note
KAUST Repository Item: Exported on 2020-10-01
Acknowledged KAUST grant number(s): KUK-I1-005-04
Acknowledgements: The authors would like to thank Y. Kanada, S. Takagi, and T. Boku for fruitful discussions. Some parts of the implementation techniques are due to N. Soda and M. Itakura. HW thanks M. Isobe for useful information on past studies. This work was supported by KAUST GRP (KUK-I1-005-04), Grants-in-Aid for Scientific Research (Contract No. 19740235), and the NIFS Collaboration Research program (NIFS10KTBS006). The computations were carried out using the facilities of the National Institute for Fusion Science; the Information Technology Center, the University of Tokyo; the Supercomputer Center, Institute for Solid State Physics, University of Tokyo; and the Research Institute for Information Technology, Kyushu University.
This publication acknowledges KAUST support, but has no KAUST affiliated authors.