TY - GEN
T1 - Generic timing fault tolerance using a timely computing base
AU - Casimiro, António
AU - Veríssimo, Paulo
N1 - Generated from Scopus record by KAUST IRTS on 2021-03-16
PY - 2002/1/1
Y1 - 2002/1/1
AB - Designing applications with timeliness requirements in environments of uncertain synchrony is known to be a difficult problem. In this paper, we follow the perspective of timing fault tolerance: timing errors occur, and they are processed using redundancy, e.g., component replication, to recover and deliver timely service. We introduce a paradigm for generic timing fault tolerance with replicated state machines. The paradigm is based on the existence of Timing Failure Detection with timed completeness and accuracy properties. Generic timing fault tolerance implies the ability to dependably observe the system and to timely notify timing failures, which we discuss in the paper. On the other hand, it ensures replica determinism with respect to time (temporal consistency), and safety in case of spare exhaustion. We show that the paradigm can be addressed and realized in the framework of the Timely Computing Base (TCB) model and architecture. Furthermore, we illustrate the generality of our approach by reviewing previous existing solutions and by showing that in contrast with ours, they only secure a restricted semantics, or simply provide ad-hoc solutions.
UR - http://ieeexplore.ieee.org/document/1028883/
UR - http://www.scopus.com/inward/record.url?scp=0036931518&partnerID=8YFLogxK
U2 - 10.1109/DSN.2002.1028883
DO - 10.1109/DSN.2002.1028883
M3 - Conference contribution
SN - 0769515975
SP - 27
EP - 36
BT - Proceedings of the 2002 International Conference on Dependable Systems and Networks
PB - IEEE Computer Society
ER -