Software cooling approach enables efficient and cost-effective thermal management of multicore systems
Authors: Kaihang Zhou, Yimin Xuan, Dinghua Hu, Qiang Li
Journal: International Journal of Heat and Mass Transfer, Volume 244, Article 126937
Publication date: 2025-03-19
Publication type: Journal Article
DOI: 10.1016/j.ijheatmasstransfer.2025.126937
URL: https://www.sciencedirect.com/science/article/pii/S0017931025002789
Impact factor: 5.0 (JCR Q1, Engineering, Mechanical)
Citations: 0
Abstract
The pursuit of high-performance electronic devices has driven semiconductor technology toward relentless miniaturization and integration. While this advancement enhances computational capabilities, it also reduces chip heat capacity and thermal inertia. Traditional hardware-based thermal management strategies face inherent limitations, including temporal heat transfer mismatches, physical size constraints, and prohibitive economic costs. To address these challenges, this study proposes a software-driven thermal management approach that achieves cost-effective thermal regulation under constrained hardware package conditions. More importantly, it effectively mitigates temperature rises caused by transient thermal pulses, a capability lacking in traditional hardware cooling. A long short-term memory (LSTM) model, a type of recurrent neural network (RNN), is integrated into our framework to enable precise temperature prediction. Combining the LSTM with an ant colony optimization (ACO) algorithm enables the scheduler to output the best allocation scheme. Results indicate that this approach lowers the mean peak temperature by more than 6 °C and the percentage of hotspots by 8%, while also reducing communication energy by 15%, compared to existing software-level thermal management technologies. External cooling resources (thermoelectric coolers, TECs) are incorporated into the task allocation algorithm for the first time. In the presence of local TECs, our approach achieves its best thermal performance. The feasibility of the approach under different workloads and platform sizes is also validated. This software cooling approach provides valuable insights into thermal management for electronic devices.
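The scheduling idea in the abstract, a temperature predictor guiding an ant colony search over task-to-core allocations, can be illustrated with a small sketch. This is not the authors' implementation. The 3x3 grid, the thermal coefficients, and the function names are all hypothetical, and a crude linear neighbor-coupling model stands in for the LSTM predictor.

```python
import random

GRID = 3                        # hypothetical 3x3 multicore platform
AMBIENT = 45.0                  # assumed baseline temperature, deg C
R_SELF, R_NEIGHBOR = 2.0, 0.6   # assumed thermal coefficients, deg C per watt


def neighbors(core):
    """Return the cores that share an edge with `core` on the grid."""
    r, c = divmod(core, GRID)
    cand = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [rr * GRID + cc for rr, cc in cand if 0 <= rr < GRID and 0 <= cc < GRID]


def predict_peak(assignment, powers):
    """Stand-in for the LSTM predictor: peak core temperature of an allocation."""
    load = [0.0] * (GRID * GRID)
    for task, core in enumerate(assignment):
        load[core] += powers[task]
    temps = [
        AMBIENT + R_SELF * load[k] + R_NEIGHBOR * sum(load[n] for n in neighbors(k))
        for k in range(GRID * GRID)
    ]
    return max(temps)


def aco_allocate(powers, ants=20, iterations=30, rho=0.1, seed=0):
    """Search for a task-to-core mapping with a low predicted peak temperature.

    Assumes at most one task per core, so len(powers) <= GRID * GRID.
    """
    rng = random.Random(seed)
    n_tasks, n_cores = len(powers), GRID * GRID
    tau = [[1.0] * n_cores for _ in range(n_tasks)]  # pheromone per (task, core)
    best = list(range(n_tasks))                      # baseline: row-major packing
    best_peak = predict_peak(best, powers)
    for _ in range(iterations):
        for _ in range(ants):
            free, tour = list(range(n_cores)), []
            for task in range(n_tasks):
                # Pick a free core with probability proportional to its pheromone.
                weights = [tau[task][core] for core in free]
                core = rng.choices(free, weights=weights)[0]
                free.remove(core)
                tour.append(core)
            peak = predict_peak(tour, powers)
            if peak < best_peak:
                best, best_peak = tour, peak
        for task in range(n_tasks):
            # Evaporate all trails, then reinforce the best allocation found so far.
            tau[task] = [(1.0 - rho) * t for t in tau[task]]
            tau[task][best[task]] += 1.0 / best_peak
    return best, best_peak


if __name__ == "__main__":
    task_powers = [8.0, 6.0, 5.0, 3.0]  # hypothetical per-task power draw, watts
    mapping, peak = aco_allocate(task_powers)
    print("task -> core:", mapping, "predicted peak (deg C):", round(peak, 2))
```

Because the search starts from the row-major baseline and only accepts improvements, the returned allocation is never hotter, under this toy model, than packing tasks into adjacent cores. A real scheduler would replace `predict_peak` with a trained predictor and would likely add communication energy to the cost.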
About the journal:
International Journal of Heat and Mass Transfer is the vehicle for the exchange of basic ideas in heat and mass transfer between research workers and engineers throughout the world. It focuses on both analytical and experimental research, with an emphasis on contributions which increase the basic understanding of transfer processes and their application to engineering problems.
Topics include:
-New methods of measuring and/or correlating transport-property data
-Energy engineering
-Environmental applications of heat and/or mass transfer