Teaching practical realistic verification of distributed algorithms in Erlang with TLA+
Authors: Peter Zeller, Annette Bieniusa, Carla Ferreira
Published in: Proceedings of the 19th ACM SIGPLAN International Workshop on Erlang
Publication date: 2020-08-23
DOI: 10.1145/3406085.3409009 (https://doi.org/10.1145/3406085.3409009)
Citations: 0
Abstract
Distributed systems are inherently complex because they must handle the interplay of communication, concurrency, and failure. Because of these interacting features, it is typically not possible to test such systems systematically; unexpected and unlikely combinations of events can produce corner cases that are hard to find. Yet because these systems typically run for long periods, such events are likely to materialize eventually and must be handled correctly. Faced with this dilemma, students can experience the benefits of applying verification tools to check their own algorithms and implementations. Executable models with automatically generated executions allow them to experiment with different solutions by iteratively adapting and refining their algorithms. In this experience report, we describe how we teach verification in a hands-on distributed systems course. We argue that broadcast algorithms provide a sweet spot in design and verification complexity. To this end, we give an implementation of these algorithms in Erlang and derive a TLA+ specification. TLA+ is a formal language for describing and reasoning about distributed and concurrent systems, and it comes with tools including the TLC model checker. Our study reveals interesting parallels between the Erlang and TLA+ code, while exposing the challenges of formally modeling communication and parallelism in distributed systems. By presenting selected aspects of our course design, we aim to demonstrate the feasibility of, and motivate the need for, introducing verification in close correspondence to programming tasks.