Parallel fault tolerant algorithms for parabolic problems

Hatem Ltaief*, Marc Garbey, Edgar Gabriel

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

2 Scopus citations

Abstract

As the number of processors in today's high-performance computing systems increases, the mean time between failures of these machines decreases. The ability of hardware and software components to handle process failures is therefore becoming increasingly important. This paper presents a fault-tolerant approach for the implicit forward time integration of parabolic problems using explicit formulas. The technique allows the application to recover from process failures and to reconstruct the lost data of the failed process(es), avoiding the roll-back operation required in most checkpoint-restart schemes. The benchmark used to demonstrate the new algorithms is the two-dimensional heat equation solved with a first-order implicit Euler scheme.

Original language: English (US)
Title of host publication: Euro-Par 2006 Parallel Processing - 12th International Euro-Par Conference, Proceedings
Publisher: Springer Verlag
Pages: 700-709
Number of pages: 10
ISBN (Print): 3540377832, 9783540377832
State: Published - 2006
Externally published: Yes
Event: 12th International Euro-Par Conference 2006 - Lisbon, Portugal
Duration: Aug 28 2006 → Sep 1 2006

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 4128 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 12th International Euro-Par Conference 2006
Country/Territory: Portugal
City: Lisbon
Period: 08/28/06 → 09/1/06

ASJC Scopus subject areas

  • Theoretical Computer Science
  • General Computer Science
