As interactions and dependencies within distributed reactive systems increase, the
problem of detecting failures which depend on the exact situation and environmental
conditions they occur in grows. As a result, not only the detection of failures is
increasingly difficult, but also the differentiation between the symptoms of a fault,
and the actual fault itself, i. e., the cause of a failure.
This thesis proposes an efficient approach for the analysis of distributed reactive
systems at runtime. It introduces a framework, referred to as monitoring-based runtime
reflection framework, for the detection of failures as well as identification of
their causes. As an interface to the framework, this thesis proposes a specification
language, Salt, to express desired system properties, which then have to hold at
all times while a system executes. Detection of failures in the framework is based
upon monitoring systems with respect to their associated properties, defined in the
high-level specification language Salt. Salt specifications are translatable into the
real-time temporal logic TLTL as well as in the untimed temporal logic LTL, and
as such can also be used with other temporal logic verification frameworks, such as
model checkers. For both logics, an efficient monitor generation procedure is developed
for either monitoring qualitative assertions about a system’s behaviour in the
untimed case, or quantitative assertions reflecting dense real-time constraints. In
order to reflect the semantics of LTL, respectively TLTL, with respect to a finite observational
trace in a suitable and unambiguous manner, a 3-valued interpretation is
proposed. The corresponding 3-valued monitoring procedure is shown to be optimal
with respect to the space-complexity of the generated monitors, and able to detect
all minimal bad prefixes of non-conforming system behaviour. Based on the results of
the monitors, a dedicated failure diagnosis is performed to identify possible explanations
for an observed deviation. As such, diagnosis can either confirm that a monitor
detected the root cause for a failure, or indicate that the fault is located elsewhere,
and possibly outside the scope of the monitored system. In the runtime reflection
framework, diagnosis is based upon the principles of first-order model-based diagnosis,
but developed in the propositional domain. The propositional diagnosis problem
is then shown to correspond to a deterministic implementation of a solution to the
#SAT problem for which a computationally efficient realisation is introduced.
This thesis develops both the theoretical foundations for runtime reflection as well as
efficient means for its implementation.
«
As interactions and dependencies within distributed reactive systems increase, the
problem of detecting failures which depend on the exact situation and environmental
conditions they occur in grows. As a result, not only the detection of failures is
increasingly difficult, but also the differentiation between the symptoms of a fault,
and the actual fault itself, i. e., the cause of a failure.
This thesis proposes an efficient approach for the analysis of distributed reactive
systems at run...
»