Abstracting Failures Away From Stateful Dataflow Systems | KTH MSc Thesis Defense 2024

Описание к видео Abstracting Failures Away From Stateful Dataflow Systems | KTH MSc Thesis Defense 2024

Abstracting Failures Away From Stateful Dataflow Systems
Aleksey Veresov
The link to the thesis publication is to appear here
Supervisors: Philipp Haller and Jonas Spenger
Examiner: Roberto Guanciale
Opponent: Gustav Andersson Kasche

Systems distributed across several computers are essential for modern infrastructure, and their reliability is reliant on the correctness of the constituent computers' failure-handling protocols. Correctness in such systems is often understood as failure transparency, a property that enables to use a system as if no failures occur in it; in other words, it states that there is a high-level model of the system, from which the failures are abstracted away. This work proves that failure transparency is provided by the Asynchronous Barrier Snapshotting protocol used in Apache Flink, a prominent distributed stateful dataflow system. This protocol is formalized in operational semantics for the first time in this thesis. As no prior definition of failure transparency is suitable for this formalization, a novel definition is proposed, applicable to systems expressed in small-step operational semantics with explicit failure-related rules. The work demonstrates how failure transparency can be proven by reasoning about each execution as a whole, presenting a proof technique convenient for proofs about checkpoint-recovery protocols. The results are a first step towards a verified stateful dataflow programming stack.

The slides: https://veresov.pro/msc-slides.pdf

Комментарии

Информация по комментариям в разработке