© 2014 IEEE. At extreme scale, irregularities in the structure of scale-free graphs such as social network graphs limit our ability to analyze these important and growing datasets. A key challenge is the presence of high-degree vertices (hubs), which leads to parallel workload and storage imbalances. The imbalances occur because existing partitioning techniques cannot effectively partition high-degree vertices. We present techniques to distribute the storage, computation, and communication of hubs for extreme-scale graphs on distributed-memory supercomputers. To balance the hub processing workload, we distribute hub data structures and related computation among a set of delegates. The delegates coordinate using highly optimized, yet portable, asynchronous broadcast and reduction operations. We demonstrate the scalability of our new algorithmic technique using Breadth-First Search (BFS), Single Source Shortest Path (SSSP), K-Core Decomposition, and PageRank on synthetically generated scale-free graphs. Our results show excellent scalability on large scale-free graphs up to 131K cores of the IBM BG/P, and outperform the best known Graph500 performance on BG/P Intrepid by 15%.
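The delegate idea in the abstract can be illustrated with a small, single-process sketch. It is not the authors' implementation: the degree threshold, the round-robin placement of hub edges, the modulo ownership rule, and the function names `partition_with_delegates` and `hub_degree_by_reduction` are all assumptions made for illustration, and the summation at the end only stands in for a distributed reduction.

```python
from collections import defaultdict


def partition_with_delegates(edges, num_parts, hub_threshold):
    """Assign directed edges (u, v) to partitions (hypothetical scheme).

    Low-degree sources keep all out-edges on one owner partition
    (u % num_parts). Sources whose out-degree exceeds hub_threshold are
    treated as hubs: their out-edges are spread round-robin over all
    partitions, so each partition holds a delegate slice of the hub.
    """
    degree = defaultdict(int)
    for u, _ in edges:
        degree[u] += 1
    hubs = {u for u, d in degree.items() if d > hub_threshold}

    parts = [[] for _ in range(num_parts)]
    next_slot = defaultdict(int)  # per-hub round-robin counter
    for u, v in edges:
        if u in hubs:
            p = next_slot[u] % num_parts
            next_slot[u] += 1
        else:
            p = u % num_parts
        parts[p].append((u, v))
    return hubs, parts


def hub_degree_by_reduction(hub, parts):
    """Each delegate counts its local hub edges; the sum mimics a reduction."""
    return sum(sum(1 for u, _ in part if u == hub) for part in parts)


# Star graph: vertex 0 points to vertices 1..12, plus three low-degree edges.
edges = [(0, v) for v in range(1, 13)] + [(1, 2), (2, 3), (3, 1)]
hubs, parts = partition_with_delegates(edges, num_parts=4, hub_threshold=4)
loads = [len(p) for p in parts]
print(hubs, loads, hub_degree_by_reduction(0, parts))
```

With plain modulo ownership, all twelve edges of vertex 0 would land on partition 0, giving loads of 12, 1, 1, 1. Splitting the hub across delegates evens this out at the cost of a reduction whenever a whole-hub quantity is needed.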
|Original language||English (US)|
|Title of host publication||SC14: International Conference for High Performance Computing, Networking, Storage and Analysis|
|Publisher||Institute of Electrical and Electronics Engineers (IEEE)|
|Number of pages||11|
|State||Published - Nov 2014|
Bibliographical note
KAUST Repository Item: Exported on 2020-10-01
Acknowledged KAUST grant number(s): KUS-C1-016-04
Acknowledgements: This work was partially performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344 (LLNL-CONF-658291). Funding was partially provided by LDRD 13-ERD-025. Portions of experiments were performed at the Livermore Computing facility. This research used resources of the Argonne Leadership Computing Facility (ALCF) at Argonne National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under contract DE-AC02-06CH11357. ALCF resources provided through an INCITE 2012 award for the Fault-Oblivious Exascale Computing Environment project. This research supported in part by NSF awards CNS-0551685, CCF-0833199, CCF-0830753, IIS-0917266, by DOE awards DE-AC02-06CH11357, DE-NA0002376 B575363, by Samsung, by Award KUS-C1-016-04, made by King Abdullah University of Science and Technology (KAUST). Pearce was supported in part by a Lawrence Scholar fellowship at LLNL.
This publication acknowledges KAUST support, but has no KAUST affiliated authors.