Flexible development of dense linear algebra algorithms on massively parallel architectures with DPLASMA

George Bosilca*, Aurelien Bouteiller, Anthony Danalis, Mathieu Faverge, Azzam Haidar, Thomas Herault, Jakub Kurzak, Julien Langou, Pierre Lemarinier, Hatem Ltaief, Piotr Luszczek, Asim YarKhan, Jack Dongarra

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

108 Scopus citations

Abstract

We present a method for developing dense linear algebra algorithms that seamlessly scales to thousands of cores. It can be done with our project called DPLASMA (Distributed PLASMA) that uses a novel generic distributed Direct Acyclic Graph Engine (DAGuE). The engine has been designed for high performance computing and thus it enables scaling of tile algorithms, originating in PLASMA, on large distributed memory systems. The underlying DAGuE framework has many appealing features when considering distributed-memory platforms with heterogeneous multicore nodes: DAG representation that is independent of the problem-size, automatic extraction of the communication from the dependencies, overlapping of communication and computation, task prioritization, and architecture-aware scheduling and management of tasks. The originality of this engine lies in its capacity to translate a sequential code with nested-loops into a concise and synthetic format which can then be interpreted and executed in a distributed environment. We present three common dense linear algebra algorithms from PLASMA (Parallel Linear Algebra for Scalable Multi-core Architectures), namely: Cholesky, LU, and QR factorizations, to investigate their data driven expression and execution in a distributed system. We demonstrate through experimental results on the Cray XT5 Kraken system that our DAG-based approach has the potential to achieve sizable fraction of peak performance which is characteristic of the state-of-the-art distributed numerical software on current and emerging architectures.

Original languageEnglish (US)
Title of host publication2011 IEEE International Symposium on Parallel and Distributed Processing, Workshops and Phd Forum, IPDPSW 2011
Pages1432-1441
Number of pages10
DOIs
StatePublished - 2011
Event25th IEEE International Parallel and Distributed Processing Symposium, Workshops and Phd Forum, IPDPSW 2011 - Anchorage, AK, United States
Duration: May 16 2011May 20 2011

Publication series

NameIEEE International Symposium on Parallel and Distributed Processing Workshops and Phd Forum

Other

Other25th IEEE International Parallel and Distributed Processing Symposium, Workshops and Phd Forum, IPDPSW 2011
Country/TerritoryUnited States
CityAnchorage, AK
Period05/16/1105/20/11

Keywords

  • Numerical linear systems
  • Scalable parallel algorithms
  • Scheduling and task partitioning

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Software
  • Theoretical Computer Science

Fingerprint

Dive into the research topics of 'Flexible development of dense linear algebra algorithms on massively parallel architectures with DPLASMA'. Together they form a unique fingerprint.

Cite this