{"title":"Session-based fault-tolerant design patterns","authors":"N. Ivaki, Filipe Araújo, F. Barros","doi":"10.1109/PADSW.2014.7097875","DOIUrl":null,"url":null,"abstract":"Despite offering reliability against dropped and reordered packets, the widely adopted Transmission Control Protocol (TCP) provides nearly no recovery options for longterm network outages. When the network fails, developers must rollback the application to some coherent state on their own, using error-prone solutions. Overcoming this limitation is, therefore, a deeply investigated and challenging problem. Existing solutions range from transport-layer to application-layer protocols, including additions to TCP, usually transparent to the application. None of these solutions is perfect, because they all impact TCP's simplicity, performance or ubiquity, if not all. To avoid these shortcomings, we contain TCP connection crashes inside a single session layer exposed as a sockets interface. Based on this interface, we create a blocking and a non-blocking fault-tolerant design pattern. We explore the blocking design in an open source File Transfer Protocol (FTP) server and perform a thorough evaluation of performance, complexity and overhead of both designs. Our results show that using one of the patterns to tolerate TCP connection crashes, in new or existing applications, involves a very limited effort and negligible penalties.","PeriodicalId":421740,"journal":{"name":"2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PADSW.2014.7097875","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 8
Abstract
Despite offering reliability against dropped and reordered packets, the widely adopted Transmission Control Protocol (TCP) provides nearly no recovery options for longterm network outages. When the network fails, developers must rollback the application to some coherent state on their own, using error-prone solutions. Overcoming this limitation is, therefore, a deeply investigated and challenging problem. Existing solutions range from transport-layer to application-layer protocols, including additions to TCP, usually transparent to the application. None of these solutions is perfect, because they all impact TCP's simplicity, performance or ubiquity, if not all. To avoid these shortcomings, we contain TCP connection crashes inside a single session layer exposed as a sockets interface. Based on this interface, we create a blocking and a non-blocking fault-tolerant design pattern. We explore the blocking design in an open source File Transfer Protocol (FTP) server and perform a thorough evaluation of performance, complexity and overhead of both designs. Our results show that using one of the patterns to tolerate TCP connection crashes, in new or existing applications, involves a very limited effort and negligible penalties.