To 4,000 compute nodes and beyond: Network-aware vertex placement in large-scale graph processing systems

Karim Awara, Hani Jamjoom, Panos Kanlis

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

1 Scopus citations

Abstract

The explosive growth of "big data" is giving rise to a new breed of large scale graph systems, such as Pregel. This poster describes our ongoing work in characterizing and minimizing the communication cost of Bulk Synchronous Parallel (BSP) graph mining systems, like Pregel, when scaling to 4,096 compute nodes. Existing implementations generally assume a fixed communication cost. This is sufficient in small deployments as the BSP programming model (i.e., overlapping computation and communication) masks small variations in the underlying network. In large scale deployments, such variations can dominate the overall runtime characteristics. In this poster, we first quantify the impact of network communication on the total compute time of a Pregel system. We then propose an efficient vertex placement strategy that subsamples highly connected vertices and applies the Reverse Cuthill-McKee (RCM) algorithm to efficiently partition the input graph and place partitions closer to each other based on their expected communication patterns. We finally describe a vertex replication strategy to further reduce communication overhead.

Original languageEnglish (US)
Title of host publicationProceedings of the SIGCOMM 2013 and Best Papers of the Co-Located Workshops
Pages501-502
Number of pages2
Edition4
DOIs
StatePublished - 2013
EventAnnual Conference of the ACM Special Interest Group on Data Communication on the Applications, Technologies, Architectures, and Protocols for Computer Communication, ACM SIGCOMM 2013 - Hong Kong, China
Duration: Aug 12 2013Aug 16 2013

Publication series

NameComputer Communication Review
Number4
Volume43
ISSN (Print)0146-4833
ISSN (Electronic)1943-5819

Other

OtherAnnual Conference of the ACM Special Interest Group on Data Communication on the Applications, Technologies, Architectures, and Protocols for Computer Communication, ACM SIGCOMM 2013
Country/TerritoryChina
CityHong Kong
Period08/12/1308/16/13

Keywords

  • bulk synchronous parallel
  • extreme scaling
  • graph mining systems
  • network topology
  • vertex placement

ASJC Scopus subject areas

  • Software
  • Computer Networks and Communications

Fingerprint

Dive into the research topics of 'To 4,000 compute nodes and beyond: Network-aware vertex placement in large-scale graph processing systems'. Together they form a unique fingerprint.

Cite this