PPAS-MiCs: Peak-power-aware scheduling of fault-tolerant mixed-criticality systems

IF 3.8, Zone 3 (Computer Science), JCR Q1: COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Sustainable Computing-Informatics & Systems, Pub Date: 2025-07-08, DOI: 10.1016/j.suscom.2025.101156

Shayan Shokri, Sepideh Safari, Shaahin Hessabi, Mohsen Ansari

Volume 47, Article 101156. https://www.sciencedirect.com/science/article/pii/S2210537925000770

Citations: 0

Abstract

Multi-core platforms have become the dominant choice for designing Mixed-Criticality Systems (MCSs). The best-known MCS is the dual-criticality system, which consists of high- and low-criticality tasks. As the number of cores has grown, the fault rate in MCSs has also increased, so fault-tolerant techniques have become crucial. Although these techniques improve system reliability, they can raise the system temperature beyond safe limits. In this paper, we present a peak-power-aware scheduling method for MCSs that employs checkpointing while guaranteeing the timing, reliability, and thermal design power (TDP) constraints. The proposed method first calculates the minimum number of checkpoints for each task and assigns them to the task's different execution sections. The cores are then divided into safety-critical and non-safety-critical pairs, and tasks are mapped to cores and scheduled. This division is preliminary and does not isolate the cores from each other. At each dedicated point in the schedule, if the TDP is violated, tasks are shifted from the last checkpoint until the constraint is satisfied. Finally, the existing slack times are exploited to improve the QoS and reduce the average power consumption of the system. Compared with state-of-the-art fault-tolerant techniques, the proposed method achieves 35.6% and 36.5% improvement in all scenarios and in feasible scenarios, respectively, without violating the TDP constraint.
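The TDP step described in the abstract (shifting work that follows a checkpoint until chip-wide power stays within the limit) can be illustrated with a minimal sketch. This is not the authors' algorithm: the `Segment` model, the discrete time ticks, the "high-criticality first, then earliest deadline" ordering, and the `schedule` function are all assumptions made for illustration. Each segment stands for the part of a task after its last checkpoint, and it is delayed one tick at a time until its core is free and total power does not exceed the TDP.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical model: the part of a task that follows its last checkpoint."""
    task: str
    core: int
    release: int      # earliest start tick (>= 0), e.g. the last checkpoint
    length: int       # execution time in ticks
    power: float      # power drawn while running, in watts
    deadline: int     # latest allowed finish tick
    high_crit: bool   # True for high-criticality tasks


def schedule(segments, tdp, horizon):
    """Place segments so that chip-wide power never exceeds tdp.

    Returns a dict mapping task name to start tick, or None when some
    segment cannot meet its deadline (the scenario is infeasible under
    this simplified model).
    """
    power = [0.0] * horizon   # total power already committed per tick
    busy = {}                 # core id -> set of occupied ticks
    starts = {}
    # Assumed priority order: high-criticality first, then earliest deadline.
    for seg in sorted(segments, key=lambda s: (not s.high_crit, s.deadline)):
        occupied = busy.setdefault(seg.core, set())
        start = seg.release
        while True:
            end = start + seg.length
            if end > seg.deadline or end > horizon:
                return None
            window = range(start, end)
            if all(t not in occupied and power[t] + seg.power <= tdp
                   for t in window):
                break
            start += 1        # shift the segment one tick later and retry
        for t in window:
            power[t] += seg.power
            occupied.add(t)
        starts[seg.task] = start
    return starts


# Two 6 W segments on different cores cannot overlap under a 10 W TDP,
# so the low-criticality one is shifted behind the high-criticality one.
demo = [
    Segment("A", core=0, release=0, length=2, power=6.0, deadline=4, high_crit=True),
    Segment("B", core=1, release=0, length=2, power=6.0, deadline=6, high_crit=False),
]
result = schedule(demo, tdp=10.0, horizon=8)
print(result)
```

In this toy model a shift only ever moves a segment later, so a tight deadline makes the scenario infeasible rather than violating the TDP. The slack-exploitation and checkpoint-count steps from the abstract are not modelled here.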

Source journal

Sustainable Computing-Informatics & Systems
Categories: COMPUTER SCIENCE, HARDWARE & ARCHITECTURE; COMPUTER SCIENCE, INFORMATION SYSTEMS

CiteScore: 10.70
Self-citation rate: 4.40%
Publications: 142

About the journal: Sustainable computing is a rapidly expanding research area spanning computer science and engineering, electrical engineering, and other engineering disciplines. The aim of Sustainable Computing: Informatics and Systems (SUSCOM) is to publish the myriad research findings related to energy-aware and thermal-aware management of computing resources. Equally important is a spectrum of related research issues, such as applications of computing that can have ecological and societal impacts. SUSCOM publishes original and timely research papers and survey articles on power, energy, temperature, and environment-related topics of current importance to readers. SUSCOM has an editorial board comprising prominent researchers from around the world and selects competitively evaluated, peer-reviewed papers.