Work-in-Progress: Cloud Computing for Time-Triggered Safety-Critical Systems
Gautam Gala, Javier Castillo Rivera, G. Fohler
2021 IEEE Real-Time Systems Symposium (RTSS), published 2021-12-01
DOI: 10.1109/rtss52674.2021.00054 (https://doi.org/10.1109/rtss52674.2021.00054)
Citations: 4
Abstract
Safety-critical (SC) applications require high availability, the possibility of run-time reconfiguration, and significant resource over-provisioning. Furthermore, they suffer from hardware obsolescence due to the use of custom or specialized hardware. Cloud computing could be used to resolve these issues. Moreover, it could improve SC systems suffering from scalability issues, e.g., the ever-growing SC railway network. However, SC applications require low latencies and guarantees that clouds cannot currently provide. In this paper, we explore the possibility of enhancing the current cloud computing paradigm by adding a resource management layer to support the deterministic execution of SC applications while providing the benefits of cloud computing principles. We provide a cloud-wide global resource manager that monitors, controls, and coordinates node-level Local Resource Managers (LRMs) placed on each private cloud node. In addition, we give guarantees to SC Virtual Machines (VMs) on each node via a novel CPU- and memory-bandwidth-aware Time-Triggered (TT) offline scheduling algorithm that generates a scheduling table for use by an LRM. To improve the utilization of cloud resources, the LRMs provide the flexibility to schedule Event-Triggered (ET) SC and non-critical VMs at run-time without regenerating the offline scheduling table. We implemented our approach in a KVM-based private cloud and performed experiments to determine the relevant overheads.
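The table-driven dispatching idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm or implementation: the names `Slot`, `LocalResourceManager`, `submit_et`, and `dispatch` are hypothetical, time is counted in abstract integer ticks, and memory-bandwidth awareness is not modelled at all. The sketch only shows how a fixed offline table can reserve slots for TT VMs while the gaps between slots are handed to ET VMs at run-time, without touching the table.

```python
from bisect import bisect_right
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    start: int  # slot start, in ticks from the beginning of the hyperperiod
    end: int    # slot end (exclusive)
    vm: str     # TT VM that owns this slot


class LocalResourceManager:
    """Hypothetical node-level dispatcher driven by an offline TT table."""

    def __init__(self, table, hyperperiod):
        self.table = sorted(table, key=lambda s: s.start)
        self.hyperperiod = hyperperiod
        # Reject overlapping or out-of-range slots: the table is fixed offline.
        prev_end = 0
        for s in self.table:
            if s.start < prev_end or s.end <= s.start or s.end > hyperperiod:
                raise ValueError(f"invalid slot: {s}")
            prev_end = s.end
        self.starts = [s.start for s in self.table]
        self.et_queue = deque()  # ET VMs admitted at run-time

    def submit_et(self, vm):
        """Admit an ET VM without regenerating the offline table."""
        self.et_queue.append(vm)

    def dispatch(self, now):
        """Return the VM to run at tick `now`, or None if the CPU idles."""
        t = now % self.hyperperiod
        i = bisect_right(self.starts, t) - 1
        if i >= 0 and t < self.table[i].end:
            return self.table[i].vm  # reserved TT slot always wins
        if self.et_queue:
            vm = self.et_queue.popleft()
            self.et_queue.append(vm)  # round-robin among ET VMs in the gaps
            return vm
        return None


if __name__ == "__main__":
    lrm = LocalResourceManager(
        [Slot(0, 3, "tt_a"), Slot(5, 8, "tt_b")], hyperperiod=10
    )
    lrm.submit_et("et_x")
    for tick in range(12):
        print(tick, lrm.dispatch(tick))
```

In this sketch the TT slots are looked up by binary search over the slot start times, so the run-time cost per dispatch decision stays small and independent of how many ET VMs are queued. A real LRM would additionally have to enforce slot boundaries through the hypervisor and account for memory bandwidth, which the sketch leaves out.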