Energy-aware neural architecture selection and hyperparameter optimization

2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW). Pub Date: 2022-05-01. DOI: 10.1109/IPDPSW55747.2022.00125

Nathan C Frey, Dan Zhao, Simon Axelrod, Michael Jones, David Bestor, V. Gadepally, Rafael Gómez-Bombarelli, S. Samsi


Citations: 3

Abstract



Artificial Intelligence (AI) and Deep Learning in particular have increasing computational requirements, with a corresponding increase in energy consumption. There is a tremendous opportunity to reduce the computational cost and environmental impact of deep learning by accelerating neural network architecture search and hyperparameter optimization, as well as explicitly designing neural architectures that optimize for both energy efficiency and performance. Here, we introduce a framework called training performance estimation (TPE), which builds upon existing techniques for training speed estimation in order to monitor energy consumption and rank model performance without training models to convergence, saving up to 90% of the time and energy of the full training budget. We benchmark TPE in the computationally intensive, well-studied domain of computer vision and in the emerging field of graph neural networks for machine-learned inter-atomic potentials, an important domain for scientific discovery with heavy computational demands. We propose variants of early stopping that generalize this common regularization technique to account for energy costs, and we study the energy costs of deploying increasingly complex, knowledge-informed architectures for AI-accelerated molecular dynamics and image classification. Our work enables immediate, significant energy savings across the entire pipeline of model development and deployment and suggests new research directions for energy-aware, knowledge-informed model architecture development.
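The abstract does not spell out how TPE ranks models or how its energy-aware early stopping is defined, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that (a) the early-training signal is the sum of per-epoch training losses over a small epoch budget, in the spirit of training speed estimation, and (b) an energy-aware stopping rule can be approximated by halting once the validation-loss improvement per joule drops below a threshold. All function names, the threshold, and the example numbers are hypothetical.

```python
from typing import Dict, List, Sequence


def training_speed_estimate(train_losses: Sequence[float], budget: int) -> float:
    """Sum of training losses over the first `budget` epochs (lower is better).

    Assumption: this summed-loss score stands in for the early-training
    signal; the abstract does not give the exact formula used by TPE.
    """
    if budget <= 0 or budget > len(train_losses):
        raise ValueError("budget must be between 1 and the number of epochs")
    return sum(train_losses[:budget])


def rank_models(curves: Dict[str, Sequence[float]], budget: int) -> List[str]:
    """Order candidate models from most to least promising after `budget` epochs."""
    return sorted(curves, key=lambda name: training_speed_estimate(curves[name], budget))


def energy_aware_stop_epoch(
    val_losses: Sequence[float],
    energy_joules: Sequence[float],
    min_gain_per_joule: float,
) -> int:
    """Return the number of epochs run before stopping.

    Hypothetical rule: stop after the first epoch whose validation-loss
    improvement, divided by the energy that epoch consumed, falls below
    `min_gain_per_joule`. If that never happens, run every epoch.
    """
    if len(val_losses) != len(energy_joules):
        raise ValueError("need one energy reading per epoch")
    for epoch in range(1, len(val_losses)):
        if energy_joules[epoch] <= 0:
            raise ValueError("energy readings must be positive")
        gain = val_losses[epoch - 1] - val_losses[epoch]
        if gain / energy_joules[epoch] < min_gain_per_joule:
            return epoch + 1  # epochs are counted from 1
    return len(val_losses)


if __name__ == "__main__":
    # Made-up training-loss curves for three hypothetical architectures.
    curves = {
        "small_cnn": [2.0, 1.5, 1.2, 1.0, 0.9],
        "wide_cnn": [1.8, 1.1, 0.8, 0.6, 0.5],
        "deep_cnn": [2.2, 1.9, 1.7, 1.6, 1.5],
    }
    print(rank_models(curves, budget=2))

    # Made-up validation losses and per-epoch energy readings in joules.
    val_losses = [1.0, 0.7, 0.6, 0.59, 0.585]
    energy = [100.0, 100.0, 100.0, 100.0, 100.0]
    print(energy_aware_stop_epoch(val_losses, energy, min_gain_per_joule=0.0005))
```

In this toy data the two-epoch ranking matches the ranking by final loss, which is the behaviour the abstract attributes to TPE; on real workloads that agreement would have to be checked empirically, for example with a rank correlation against fully trained models.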


Source journal

2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)

Self-citation rate

0.00%

Publication count