Fork-join and data-driven execution models on multi-core architectures: Case study of the FMM

Abdelhalim Amer, Naoya Maruyama, Miquel Pericàs, Kenjiro Taura, Rio Yokota, Satoshi Matsuoka

Research output: Chapter in Book/Report/Conference proceedingConference contribution

17 Scopus citations


Extracting maximum performance of multi-core architectures is a difficult task primarily due to bandwidth limitations of the memory subsystem and its complex hierarchy. In this work, we study the implications of fork-join and data-driven execution models on this type of architecture at the level of task parallelism. For this purpose, we use a highly optimized fork-join based implementation of the FMM and extend it to a data-driven implementation using a distributed task scheduling approach. This study exposes some limitations of the conventional fork-join implementation in terms of synchronization overheads. We find that these are not negligible and their elimination by the data-driven method, with a careful data locality strategy, was beneficial. Experimental evaluation of both methods on state-of-the-art multi-socket multi-core architectures showed up to 22% speed-ups of the data-driven approach compared to the original method. We demonstrate that a data-driven execution of FMM not only improves performance by avoiding global synchronization overheads but also reduces the memory-bandwidth pressure caused by memory-intensive computations. © 2013 Springer-Verlag.
Original languageEnglish (US)
Title of host publicationLecture Notes in Computer Science
PublisherSpringer Nature
Number of pages12
ISBN (Print)9783642387494
StatePublished - 2013

Bibliographical note

KAUST Repository Item: Exported on 2020-10-01

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science(all)


Dive into the research topics of 'Fork-join and data-driven execution models on multi-core architectures: Case study of the FMM'. Together they form a unique fingerprint.

Cite this