Application-Transparent Fault Tolerance in Distributed Systems

  • We present a new software architecture in which all concepts necessary to achieve fault tolerance can be added to an appli- cation automatically without any source code changes. As a case study, we consider the problem of providing a reliable service despite node failures by executing a group of replicat- ed servers. Replica creation and management as well as fail- ure detection and recovery are performed automatically by a separate fault tolerance layer (ft-layer) which is inserted be- tween the server application and the operating system kernel. The layer is invisible for the application since it provides the same functional interface as the operating system kernel, thus making the fault tolerance property of the service completely transparent for the application. A major advantage of our ar- chitecture is that the layer encapsulates both fault tolerance mechanisms and policies. This allows for maximum flexibility in the choice of appropriate methods for fault tolerance with- out any changes in the application code.

Download full text files

Export metadata

Additional Services

Search Google Scholar
Metadaten
Author:Thomas Becker
URN:urn:nbn:de:hbz:386-kluedo-1886
Document Type:Preprint
Language of publication:English
Year of Completion:1999
Year of first Publication:1999
Publishing Institution:Technische Universität Kaiserslautern
Date of the Publication (Server):2000/04/03
Faculties / Organisational entities:Kaiserslautern - Fachbereich Informatik
DDC-Cassification:0 Allgemeines, Informatik, Informationswissenschaft / 004 Informatik
Licence (German):Standard gemäß KLUEDO-Leitlinien vor dem 27.05.2011