Self-Healing Protocol: Repairing Schedules Online after Link Failures in Time-Triggered Networks
Francisco Pozo, G. Rodríguez-Navas, H. Hansson
2021 51st Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), June 2021
DOI: https://doi.org/10.1109/DSN48987.2021.00028
Citations: 1
Abstract
Switched networks that follow the time-triggered paradigm rely on static schedules that determine the communication pattern over each link. To tolerate link failures, existing methods rely either on spatial redundancy or on resynthesizing and replacing the schedule. These methods, however, do not scale to larger networks, which may be needed, e.g., for future large-scale cyber-physical systems. We propose a distributed Self-Healing Protocol (SHP) that, instead of recomputing the whole schedule, repairs the existing schedule at runtime. To do so, the nodes of the network coordinate to recast the repair problem as a number of significantly smaller local synthesis problems, which are solved in parallel by the nodes that need to reroute the frames affected by link failures. SHP exhibits a high success rate compared to full rescheduling, as well as remarkable scalability: it repairs the schedule in milliseconds, whereas rescheduling may require minutes for large networks.
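
The abstract does not describe SHP's algorithm in detail, so the Python sketch below is only an illustration of the general idea of local repair, not the authors' protocol. It assumes a slotted schedule in which each directed link has a set of occupied slots, and it repairs a single frame by finding a detour around the failed link with breadth-first search and greedily picking free slots along that detour. The names `find_detour`, `repair_frame`, `busy`, `earliest`, and `deadline` are all hypothetical, and the distributed coordination among nodes is not modeled.

```python
from collections import deque


def find_detour(adj, src, dst, failed):
    """BFS for a shortest path src -> dst that avoids the failed link."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            # Walk the predecessor chain back to src and reverse it.
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nxt in adj[node]:
            # Skip the failed link (in either direction) and visited nodes.
            if {node, nxt} == set(failed) or nxt in prev:
                continue
            prev[nxt] = node
            queue.append(nxt)
    return None


def repair_frame(adj, busy, failed, earliest, deadline):
    """Try to reroute one frame around failed = (u, v).

    busy:     dict mapping a directed link (a, b) to its occupied slots.
    earliest: first slot at which the frame is available at u.
    deadline: last slot in which the frame may be sent on the final hop to v.
    Returns a list of (link, slot) pairs, or None if no local repair
    is found (assumption: the caller then falls back to rescheduling).
    """
    u, v = failed
    path = find_detour(adj, u, v, failed)
    if path is None:
        return None
    slot = earliest
    assignment = []
    for link in zip(path, path[1:]):
        # Greedy choice: first free slot on this link, after the previous hop.
        while slot in busy.get(link, set()):
            slot += 1
        if slot > deadline:
            return None
        assignment.append((link, slot))
        slot += 1
    # Commit only after every hop of the detour has been placed.
    for link, s in assignment:
        busy.setdefault(link, set()).add(s)
    return assignment


# Triangle network A-B-C; link A-B fails; slot 0 on A->C is already taken.
adj = {"A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B"]}
busy = {("A", "C"): {0}}
repaired = repair_frame(adj, busy, ("A", "B"), earliest=0, deadline=5)
print(repaired)
```

Only the links on the detour are touched and the rest of the schedule stays as it was, which is why a repair problem of this kind is much smaller than resynthesizing the whole schedule. A greedy slot search can fail where a full synthesis would succeed, so a sketch like this would need a fallback when `repair_frame` returns `None`.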