P. Mammen, Noman Bashir, Ramachandra Rao Kolluri, Eun Kung Lee, P. Shenoy
Proceedings of the 14th ACM International Conference on Future Energy Systems, published 2023-06-16. DOI: 10.1145/3575813.3595203 (https://doi.org/10.1145/3575813.3595203)
CUFF: A Configurable Uncertainty-driven Forecasting Framework for Green AI Clusters
AI applications are driving the need for large dedicated GPU clusters, which are highly energy- and carbon-intensive. To operate these clusters efficiently, operators rely on workload forecasts that inform resource allocation decisions, saving energy without sacrificing performance. Traditional forecasting methods provide a single-point forecast and do not expose the uncertainty in their predictions, which can lead to an unexpected loss in performance. In this paper, we present an uncertainty-driven GPU demand forecasting framework that exposes the uncertainty in its predictions and provides a mechanism to configure the trade-off between energy savings and performance. We evaluate our approach using multiple GPU workload traces and demonstrate that the forecasting framework, called CUFF, outperforms state-of-the-art point predictions. Under high GPU demand, the CUFF predictor meets performance goals 83% of the time, compared to 7.6% for point predictions. Furthermore, the CUFF knob lets users configure a performance target of up to 98% while providing 26% energy savings, a value comparable to that of point forecasts, which ensure only a 68% performance target.
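The abstract does not say how CUFF computes its uncertainty estimate or how its knob is implemented, so the following is only a minimal sketch of the general idea of trading energy for performance through a configurable target. It assumes a simple stand-in technique: add to a point forecast a safety margin equal to an empirical quantile of past forecast errors. The function names, the `target` parameter, and the sample numbers are all hypothetical and are not taken from the paper.

```python
import math


def empirical_quantile(values, q):
    """Return the smallest sample v such that at least a fraction q of samples are <= v."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0.0 < q <= 1.0:
        raise ValueError("q must be in (0, 1]")
    ordered = sorted(values)
    rank = math.ceil(q * len(ordered))  # 1-based rank into the sorted samples
    return ordered[rank - 1]


def gpus_to_provision(point_forecast, past_errors, target):
    """Point forecast plus a margin drawn from past forecast errors.

    past_errors holds (actual - forecast) for earlier intervals (made-up data here).
    target is a hypothetical knob: the fraction of intervals in which the
    provisioned GPUs should cover the actual demand.
    """
    margin = empirical_quantile(past_errors, target)
    return max(0, math.ceil(point_forecast + margin))


# Illustrative forecast errors in GPUs, not measurements from any trace.
past_errors = [-6, -4, -3, -1, 0, 1, 2, 4, 7, 12]

for target in (0.5, 0.75, 0.98):
    print(target, gpus_to_provision(40, past_errors, target))
```

Raising `target` in this sketch keeps more GPUs powered on, so performance goals are missed less often at the cost of smaller energy savings; lowering it does the reverse. A point forecast corresponds to using no margin at all.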