Application-Transparent Fault Tolerance in Distributed Systems

  • We present a new software architecture in which all concepts necessary to achieve fault tolerance can be added to an appli- cation automatically without any source code changes. As a case study, we consider the problem of providing a reliable service despite node failures by executing a group of replicat- ed servers. Replica creation and management as well as fail- ure detection and recovery are performed automatically by a separate fault tolerance layer (ft-layer) which is inserted be- tween the server application and the operating system kernel. The layer is invisible for the application since it provides the same functional interface as the operating system kernel, thus making the fault tolerance property of the service completely transparent for the application. A major advantage of our ar- chitecture is that the layer encapsulates both fault tolerance mechanisms and policies. This allows for maximum flexibility in the choice of appropriate methods for fault tolerance with- out any changes in the application code.

Download full text files

Export metadata

Additional Services

Share in Twitter Search Google Scholar
Author:Thomas Becker
URN (permanent link):urn:nbn:de:hbz:386-kluedo-1886
Document Type:Preprint
Language of publication:English
Year of Completion:1999
Year of Publication:1999
Publishing Institute:Technische Universität Kaiserslautern
Date of the Publication (Server):2000/04/03
Faculties / Organisational entities:Fachbereich Informatik
DDC-Cassification:0 Allgemeines, Informatik, Informationswissenschaft / 004 Informatik
Licence (German):Standard gemäß KLUEDO-Leitlinien vor dem 27.05.2011