AMA: Asynchronous management of accelerators for task-based programming models

Judit Planas, Rosa M. Badia, Eduard Ayguade, Jesus Labarta

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

6 Scopus citations


Computational science has benefited in recent years from emerging accelerators that increase the performance of scientific simulations, but using these devices makes programming harder. This paper presents AMA: a set of optimization techniques to efficiently manage multi-accelerator systems. AMA maximizes the overlap of computation and communication in a blocking-free way, so the time otherwise spent waiting for device operations can be used to do other work. Implemented on top of a task-based framework, AMA is evaluated on a quad-GPU node, where it reaches the performance of hand-tuned native CUDA code while fully hiding device management from the programmer. In addition, it achieves a speed-up of more than 2x, in the best case, over the original framework implementation.
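The blocking-free overlap described in the abstract can be illustrated with a minimal sketch. This is not AMA's implementation and does not use CUDA or the task-based framework. It assumes a thread pool as a stand-in for asynchronous device operations, and the names `device_op` and `overlap_with_host_work` are hypothetical. The host polls for completion without blocking and runs queued host work while operations are still in flight.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def device_op(duration):
    """Hypothetical stand-in for an asynchronous device operation
    (a memory copy or a kernel); it just sleeps for `duration` seconds."""
    time.sleep(duration)
    return duration


def overlap_with_host_work(op_durations, host_work):
    """Launch all device operations, then poll them without blocking,
    running one queued host callable whenever operations are still pending."""
    queue = list(host_work)          # copy so the caller's list is not mutated
    completed, host_done = [], 0
    with ThreadPoolExecutor(max_workers=max(1, len(op_durations))) as pool:
        pending = [pool.submit(device_op, d) for d in op_durations]
        while pending:
            still_pending = []
            for fut in pending:
                if fut.done():       # non-blocking completion check
                    completed.append(fut.result())
                else:
                    still_pending.append(fut)
            pending = still_pending
            if pending and queue:
                queue.pop()()        # use the spare time for other work
                host_done += 1
            elif pending:
                time.sleep(0.001)    # nothing else to do; back off briefly
    return completed, host_done
```

For example, `overlap_with_host_work([0.1, 0.05], [task_a, task_b])` returns the durations of the finished operations in completion order, together with the number of host callables that ran while the host would otherwise have been waiting.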
Original language: English (US)
Title of host publication: Procedia Computer Science
Publisher: Elsevier BV
Number of pages: 10
State: Published - Jun 1 2015
Externally published: Yes

Bibliographical note

KAUST Repository Item: Exported on 2022-06-24
Acknowledgements: European Commission (HiPEAC-3 Network of Excellence, FP7-ICT 287759), Intel-BSC Exascale Lab and IBM/BSC Exascale Initiative collaboration, Spanish Ministry of Education (FPU), Computación de Altas Prestaciones VI (TIN2012-34557), Generalitat de Catalunya (2014-SGR-1051). We thank KAUST IT Research Computing for granting access to their machines.
This publication acknowledges KAUST support, but has no KAUST affiliated authors.

