TY - JOUR
T1 - State restoration in Ada 95
T2 - A portable approach to supporting software fault tolerance
AU - Rogers, P.
AU - Wellings, A. J.
PY - 2000/3/15
Y1 - 2000/3/15
AB - Studies indicate that techniques for tolerating hardware faults are so effective that software design errors are the leading cause of all faults encountered. To handle these unanticipated software faults, two main approaches have been proposed: N-version programming and recovery blocks. Both are based on the concept of design diversity: the assumption that different designs will exhibit different faults (if any) for the same inputs and will, therefore, provide alternatives for each other. Both approaches have advantages, but this paper focuses upon recovery blocks; specifically, the requirement to save and restore application state. Judicious saving of state has been described as 'checkpointing' for over a decade. Using the object-oriented features of the revised Ada language (Ada 95) - a language widely used in this domain - we present three portable implementations of a checkpointing facility and discuss the trade-offs offered by each. Results of the implementation of these mechanisms are used to highlight both the strengths and weaknesses of some of the object-oriented features of Ada. We then show a reusable implementation of recovery blocks illustrating the checkpointing schemes. A performance analysis is made and measurements are presented in support of the analysis.
KW - ERROR RECOVERY
KW - ATOMIC ACTIONS
KW - SYSTEMS
KW - RELIABILITY
UR - http://www.scopus.com/inward/record.url?scp=0033904998&partnerID=8YFLogxK
U2 - 10.1016/S0164-1212(99)00100-4
DO - 10.1016/S0164-1212(99)00100-4
M3 - Article
VL - 50
SP - 237
EP - 255
JO - Journal of Systems and Software
JF - Journal of Systems and Software
IS - 3
ER -