{"title":"Machine Learning and Simulation Based Temperature Prediction on High-Performance Processors","authors":"Carlton Knox, Zihao Yuan, A. Coskun","doi":"10.1115/ipack2022-96751","journal":{"name":"ASME 2022 International Technical Conference and Exhibition on Packaging and Integration of Electronic and Photonic Microsystems"},"publicationDate":"2022-10-25","citationCount":"3","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1115/ipack2022-96751"}
Machine Learning and Simulation Based Temperature Prediction on High-Performance Processors
Emerging thermal management policies for high-power processors often rely on temperature readings from on-chip digital thermal sensors. However, these sensors may not accurately measure the maximum on-chip temperature. Thermal hot spots are typically located near critical CPU components, where little power and physical space is available for sensors, so sensors usually must be placed some distance away from the hot spots. On-chip thermal sensors also operate within an error margin and can therefore under- or overestimate the temperature. Prior methods introduced machine learning algorithms, trained on infrared (IR) camera measurements of the physical chip, that predict chip temperatures and construct accurate on-chip thermal profiles. While such methods produce an accurate model, the thermal imaging setup is expensive, and collecting and processing temperature data for a physical chip can be time-consuming. This paper proposes a simulation-based method that uses a machine learning regression model to predict a chip’s full temperature map based solely on the current power usage, core utilization, and measured sensor temperatures. The proposed model is trained and evaluated on data generated from performance, power, and thermal simulations of the Intel i7 6950X Extreme Edition processor. When running a set of realistic benchmarks, the model predicts temperatures with a root mean squared error (RMSE) of less than 0.25°C. The model’s accuracy is robust to the placement of the thermal sensors: the maximum error resulting from sensor placement is less than 0.12°C. In a real-world application, the proposed model can be trained on realistic simulation data or measured temperature data and then applied to predict a chip’s temperature map in real time. Training on actual temperature data measured with an IR camera is more accurate, but the IR camera setup is expensive; training on simulation data is low-cost and more practical.
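As a rough illustration of the workflow the abstract describes, the Python sketch below trains a regression model on simulated data and reports the RMSE over the full temperature map. Everything specific in it is an assumption made for illustration. The abstract does not say which regression model the authors use, so ordinary least squares stands in. The toy thermal model (`RESPONSE`, a Gaussian heat spread on an 8x8 grid), the four-core layout, the sensor positions, and all numeric constants are hypothetical and do not come from the paper's simulators or data.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Hypothetical chip: 4 cores on an 8x8 grid of thermal cells (not from the paper).
N_CORES, GRID = 4, 8
N_CELLS = GRID * GRID

# Toy thermal response: each core heats nearby cells with a Gaussian falloff.
ys, xs = np.mgrid[0:GRID, 0:GRID]
CORE_CENTERS = [(2, 2), (2, 5), (5, 2), (5, 5)]
RESPONSE = np.stack([
    0.8 * np.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / 8.0).ravel()
    for cy, cx in CORE_CENTERS
])  # shape (N_CORES, N_CELLS), degrees C per watt (made-up scale)


def make_dataset(n_samples, sensor_cells):
    """Generate toy (features, temperature map) pairs standing in for simulation output."""
    util = rng.uniform(0.0, 1.0, size=(n_samples, N_CORES))           # core utilization
    core_power = 5.0 + 20.0 * util * rng.uniform(0.8, 1.2, size=util.shape)
    total_power = core_power.sum(axis=1, keepdims=True)               # chip power usage
    ambient = rng.normal(45.0, 2.0, size=(n_samples, 1))              # unobserved offset
    temps = ambient + core_power @ RESPONSE                           # full temperature map
    # Sensors read the map at a few cells, with measurement noise.
    noise = rng.normal(0.0, 0.5, size=(n_samples, len(sensor_cells)))
    sensors = temps[:, sensor_cells] + noise
    features = np.hstack([total_power, util, sensors])
    return features, temps


def fit_linear(features, targets):
    """Ordinary least squares with a bias column; one output per grid cell."""
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(X, targets, rcond=None)
    return weights


def predict(weights, features):
    X = np.hstack([features, np.ones((features.shape[0], 1))])
    return X @ weights


def rmse(pred, true):
    return float(np.sqrt(np.mean((pred - true) ** 2)))


# Sensors sit away from the core centers (flat cell indices, hypothetical).
sensor_cells = np.array([0, 28, 63])
X_train, Y_train = make_dataset(2000, sensor_cells)
X_test, Y_test = make_dataset(500, sensor_cells)

W = fit_linear(X_train, Y_train)
test_rmse = rmse(predict(W, X_test), Y_test)
# Baseline: always predict the mean training map, ignoring all inputs.
baseline_rmse = rmse(np.broadcast_to(Y_train.mean(axis=0), Y_test.shape), Y_test)

print(f"model RMSE: {test_rmse:.3f} degC, mean-map baseline: {baseline_rmse:.3f} degC")
```

With real data, the features and target maps would come from the performance, power, and thermal simulations (or from IR measurements) rather than from `make_dataset`. Rerunning the script with different `sensor_cells` gives a simple way to probe how sensitive the error is to sensor placement.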