Trimmer: Cost-Efficient Deep Learning Auto-tuning for Cloud Datacenters

Q1 Computer Science

IEEE Cloud Computing Pub Date : 2022-07-01 DOI:10.1109/CLOUD55607.2022.00061

Damian Borowiec, G. Yeung, A. Friday, Richard Harper, P. Garraghan


Citations: 0

Abstract



Cloud datacenters can provision high-performance Machine Learning-as-a-Service (MLaaS) at reduced resource cost through auto-tuning: automated tensor program optimization of Deep Learning (DL) models to minimize inference latency on a given hardware device. However, given the extensive heterogeneity of DL models, libraries, and hardware devices, auto-tuning within Cloud datacenters incurs significant time, compute resource, and energy costs that state-of-the-art auto-tuning is not designed to mitigate. In this paper we propose Trimmer, a high-performance and cost-efficient DL auto-tuning framework for Cloud datacenters. Trimmer maximizes DL model performance and tensor program cost-efficiency in two ways: it preempts tensor program implementations that exhibit poor optimization improvement, and it applies an ML-based filtering method that replaces expensive, low-performing tensor programs, increasing the likelihood of selecting low-latency tensor programs. An empirical study of the cost of DL model optimization techniques indicates that 26–43% of total energy is expended on measuring tensor program implementations that do not positively contribute to auto-tuning. Experimental results show that Trimmer achieves high auto-tuning cost-efficiency across different DL models and reduces auto-tuning energy use by 21.8–40.9% for Cloud clusters, while achieving DL model latency equivalent to state-of-the-art techniques.
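The abstract does not spell out how Trimmer decides when to preempt a tuning task, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a simple patience rule: stop measuring once several consecutive candidates fail to improve the best latency by a relative margin. The function name `tune_with_preemption` and the parameters `patience` and `min_rel_improvement` are hypothetical, and the latency trace is made-up illustrative data.

```python
from typing import Iterable, Tuple


def tune_with_preemption(
    latencies_ms: Iterable[float],
    patience: int = 3,
    min_rel_improvement: float = 0.01,
) -> Tuple[float, int]:
    """Illustrative patience-based preemption (an assumption, not Trimmer itself).

    Consumes measured latencies one at a time and stops as soon as
    `patience` consecutive measurements fail to beat the best latency
    by at least `min_rel_improvement` (relative).

    Returns (best latency seen, number of measurements actually taken).
    """
    best = float("inf")
    stale = 0      # consecutive measurements without sufficient improvement
    measured = 0   # measurements paid for so far

    for latency in latencies_ms:
        measured += 1
        if latency < best * (1.0 - min_rel_improvement):
            best = latency
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                break  # preempt: further measurements look unproductive

    return best, measured


if __name__ == "__main__":
    # Hypothetical latency trace (ms) for eight candidate tensor programs.
    trace = [10.0, 8.0, 7.5, 7.6, 7.55, 7.52, 5.0, 4.0]
    best, measured = tune_with_preemption(trace)
    print(f"best={best} ms after {measured} of {len(trace)} measurements")
```

On this trace the loop stops after six of the eight measurements with a best latency of 7.5 ms, so it never sees the faster candidates at the end. That is the trade-off any such rule carries: a smaller `patience` saves more measurement cost but raises the risk of stopping before a better tensor program is found.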


Source journal

IEEE Cloud Computing (Computer Science: Computer Networks and Communications)

CiteScore: 11.20

Self-citation rate: 0.00%

Journal description: Cessation. IEEE Cloud Computing is committed to the timely publication of peer-reviewed articles that provide innovative research ideas, application results, and case studies in all areas of cloud computing. Topics relating to novel theory, algorithms, performance analyses and applications of techniques are covered. More specifically: Cloud software, Cloud security, Trade-offs between privacy and utility of cloud, Cloud in the business environment, Cloud economics, Cloud governance, Migrating to the cloud, Cloud standards, Development tools, Backup and recovery, Interoperability, Applications management, Data analytics, Communications protocols, Mobile cloud, Private clouds, Liability issues for data loss on clouds, Data integration, Big data, Cloud education, Cloud skill sets, Cloud energy consumption, The architecture of cloud computing, Applications in commerce, education, and industry, Infrastructure as a Service (IaaS), Platform as a Service (PaaS), Software as a Service (SaaS), Business Process as a Service (BPaaS)