Machine Learning and Simulation Based Temperature Prediction on High-Performance Processors

Carlton Knox, Zihao Yuan, A. Coskun
DOI: 10.1115/ipack2022-96751
Venue: ASME 2022 International Technical Conference and Exhibition on Packaging and Integration of Electronic and Photonic Microsystems
Published: 2022-10-25
Link: https://doi.org/10.1115/ipack2022-96751
Citations: 3

Abstract

Emerging thermal management policies for high-power processors often rely on temperature readings from on-chip digital thermal sensors. However, these sensors may not accurately measure the maximum on-chip temperature. Thermal hot spots are typically located near important CPU components, which limits the power and physical space available for sensors, so sensors usually have to be placed some distance away from the hot spots. On-chip thermal sensors also operate within an error margin, so their readings can under- or overestimate the actual temperature. Prior methods used machine learning algorithms trained on infrared (IR) camera measurements of the physical chip to predict chip temperatures and construct accurate on-chip thermal profiles. While such methods produce an accurate model, the thermal imaging setup is expensive, and collecting and processing temperature data for a physical chip can be time-consuming. This paper proposes a simulation-based method that uses a machine learning regression model to predict a chip's full temperature map based solely on the current power usage, core utilization, and measured sensor temperatures. The proposed model is trained and evaluated on data generated from performance, power, and thermal simulations of the Intel i7 6950X Extreme Edition processor. When running a set of realistic benchmarks, the model predicts temperatures with a root mean squared error (RMSE) of less than 0.25°C. The model's accuracy is largely insensitive to the placement of the thermal sensors: the maximum error resulting from sensor placement is less than 0.12°C. For a real-world application, the proposed model can be trained on realistic simulation data or measured temperature data and then applied to predict a chip's temperature map in real time. Training on actual temperature data measured with an IR camera is more accurate, but the IR camera setup itself is expensive. Training the machine learning model on simulation data is low-cost and more practical than temperature prediction based on an expensive IR camera.
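The abstract does not say which regression model or simulators the authors used, so the following is only a minimal sketch of the general idea: map power, utilization, and a few sensor readings to a full temperature grid, then score the result with RMSE. Everything in it is an assumption made for illustration, including the 4x4 grid, the two sensor locations, the linear toy "ground truth", the unobserved ambient offset, and the choice of a ridge-regularised linear model per grid cell. The data is synthetic and the printed error says nothing about the paper's results.

```python
import math
import random

N_CELLS = 16  # hypothetical 4x4 temperature grid, flattened


def make_sample(rng):
    """Toy stand-in for one simulated time step (not the authors' simulator)."""
    power = rng.uniform(20.0, 140.0)   # package power in W (assumed range)
    util = rng.uniform(0.0, 1.0)       # core utilization
    ambient = rng.uniform(-3.0, 3.0)   # unobserved offset, only visible via sensors
    temps = [40.0 + ambient + (0.10 + 0.01 * c) * power
             + (5.0 + 0.5 * c) * util + rng.gauss(0.0, 0.05)
             for c in range(N_CELLS)]
    # Two noisy sensors placed away from the hottest cell (the last one).
    sensors = [temps[0] + rng.gauss(0.0, 0.1), temps[8] + rng.gauss(0.0, 0.1)]
    return [1.0, power, util] + sensors, temps  # leading 1.0 is the bias term


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_linear_map(features, temps, ridge=1e-6):
    """Fit one linear model per grid cell via the ridge normal equations."""
    d = len(features[0])
    gram = [[sum(f[i] * f[j] for f in features) + (ridge if i == j else 0.0)
             for j in range(d)] for i in range(d)]
    weights = []
    for c in range(len(temps[0])):
        rhs = [sum(f[i] * t[c] for f, t in zip(features, temps))
               for i in range(d)]
        weights.append(solve(gram, rhs))
    return weights


def predict_map(weights, feature_row):
    """Predict the temperature of every grid cell for one feature row."""
    return [sum(w * x for w, x in zip(cell_w, feature_row))
            for cell_w in weights]


def rmse(pred_maps, true_maps):
    """Root mean squared error over all cells of all maps."""
    sq = [(p - t) ** 2
          for pm, tm in zip(pred_maps, true_maps) for p, t in zip(pm, tm)]
    return math.sqrt(sum(sq) / len(sq))


rng = random.Random(0)
train = [make_sample(rng) for _ in range(300)]
test = [make_sample(rng) for _ in range(100)]
weights = fit_linear_map([f for f, _ in train], [t for _, t in train])
predictions = [predict_map(weights, f) for f, _ in test]
error = rmse(predictions, [t for _, t in test])
print(f"held-out RMSE on synthetic data: {error:.3f} °C")
```

In this toy setup the sensor readings are what let the model recover the ambient offset that power and utilization alone cannot explain, which mirrors why measured sensor temperatures appear among the model inputs. Training on IR camera measurements instead of simulation output would only change where the `(features, temps)` pairs come from.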