Abstract
Communication overhead has traditionally been the primary metric for evaluating rollback-recovery protocols. This paper reexamines the prominence of this metric in light of recent increases in processor and network speeds. We introduce a new recovery algorithm for a family of rollback-recovery protocols based on logging. The new algorithm incurs a higher communication overhead during recovery than previous algorithms, but it requires fewer accesses to stable storage and imposes no restrictions on the execution of live processes. Experimental results show that the new algorithm performs better than one that is optimized for low communication overhead. These results suggest that in modern environments, the latency of accessing stable storage and the degree to which a recovery algorithm intrudes on the execution of live processes are more important than the number of messages exchanged during recovery.
Original language | English (US) |
---|---|
Pages | 74-79 |
Number of pages | 6 |
State | Published - 1995 |
Externally published | Yes |
Event | Proceedings of the 14th Annual ACM Symposium on Principles of Distributed Computing - Ottawa, Canada; Duration: Aug 20 1995 → Aug 23 1995 |
Conference
Conference | Proceedings of the 14th Annual ACM Symposium on Principles of Distributed Computing |
---|---|
City | Ottawa, Canada |
Period | 08/20/95 → 08/23/95 |
ASJC Scopus subject areas
- Software
- Hardware and Architecture
- Computer Networks and Communications