{"title":"通过运行时层冻结、模型量化和提前停止实现高能效神经网络训练","authors":"Álvaro Domingo Reguero , Silverio Martínez-Fernández , Roberto Verdecchia","doi":"10.1016/j.csi.2024.103906","DOIUrl":null,"url":null,"abstract":"<div><h3>Background:</h3><p>In the last years, neural networks have been massively adopted by industry and research in a wide variety of contexts. Neural network milestones are generally reached by scaling up computation, completely disregarding the carbon footprint required for the associated computations. This trend has become unsustainable given the ever-growing use of deep learning, and could cause irreversible damage to the environment of our planet if it is not addressed soon.</p></div><div><h3>Objective:</h3><p>In this study, we aim to analyze not only the effects of different energy saving methods for neural networks but also the effects of the moment of intervention, and what makes certain moments optimal.</p></div><div><h3>Method:</h3><p>We developed a novel dataset by training convolutional neural networks in 12 different computer vision datasets and applying runtime decisions regarding layer freezing, model quantization and early stopping at different epochs in each run. We then fit an auto-regressive prediction model on the data collected capable to predict the accuracy and energy consumption achieved on future epochs for different methods. The predictions on accuracy and energy are used to estimate the optimal training path.</p></div><div><h3>Results:</h3><p>Following the predictions of the model can save 56.5% of energy consumed while also increasing validation accuracy by 2.38% by avoiding overfitting.The prediction model developed can predict the validation accuracy with a 8.4% of error, the energy consumed with a 14.3% of error and the trade-off between both with a 8.9% of error.</p></div><div><h3>Conclusions:</h3><p>This prediction model could potentially be used by the training algorithm to decide which methods apply to the model and at what moment in order to maximize the accuracy-energy trade-off.</p></div>","PeriodicalId":50635,"journal":{"name":"Computer Standards & Interfaces","volume":"92 ","pages":"Article 103906"},"PeriodicalIF":4.1000,"publicationDate":"2024-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0920548924000758/pdfft?md5=9fe0b023bddfc875c825b0c53e63af06&pid=1-s2.0-S0920548924000758-main.pdf","citationCount":"0","resultStr":"{\"title\":\"Energy-efficient neural network training through runtime layer freezing, model quantization, and early stopping\",\"authors\":\"Álvaro Domingo Reguero , Silverio Martínez-Fernández , Roberto Verdecchia\",\"doi\":\"10.1016/j.csi.2024.103906\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><h3>Background:</h3><p>In the last years, neural networks have been massively adopted by industry and research in a wide variety of contexts. Neural network milestones are generally reached by scaling up computation, completely disregarding the carbon footprint required for the associated computations. 
This trend has become unsustainable given the ever-growing use of deep learning, and could cause irreversible damage to the environment of our planet if it is not addressed soon.</p></div><div><h3>Objective:</h3><p>In this study, we aim to analyze not only the effects of different energy saving methods for neural networks but also the effects of the moment of intervention, and what makes certain moments optimal.</p></div><div><h3>Method:</h3><p>We developed a novel dataset by training convolutional neural networks in 12 different computer vision datasets and applying runtime decisions regarding layer freezing, model quantization and early stopping at different epochs in each run. We then fit an auto-regressive prediction model on the data collected capable to predict the accuracy and energy consumption achieved on future epochs for different methods. The predictions on accuracy and energy are used to estimate the optimal training path.</p></div><div><h3>Results:</h3><p>Following the predictions of the model can save 56.5% of energy consumed while also increasing validation accuracy by 2.38% by avoiding overfitting.The prediction model developed can predict the validation accuracy with a 8.4% of error, the energy consumed with a 14.3% of error and the trade-off between both with a 8.9% of error.</p></div><div><h3>Conclusions:</h3><p>This prediction model could potentially be used by the training algorithm to decide which methods apply to the model and at what moment in order to maximize the accuracy-energy trade-off.</p></div>\",\"PeriodicalId\":50635,\"journal\":{\"name\":\"Computer Standards & Interfaces\",\"volume\":\"92 \",\"pages\":\"Article 103906\"},\"PeriodicalIF\":4.1000,\"publicationDate\":\"2024-08-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.sciencedirect.com/science/article/pii/S0920548924000758/pdfft?md5=9fe0b023bddfc875c825b0c53e63af06&pid=1-s2.0-S0920548924000758-main.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computer Standards & Interfaces\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0920548924000758\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer Standards & Interfaces","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0920548924000758","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
Energy-efficient neural network training through runtime layer freezing, model quantization, and early stopping
Background:
In recent years, neural networks have been widely adopted by industry and research in a broad variety of contexts. Neural network milestones are generally reached by scaling up computation, disregarding the carbon footprint of the associated computations. Given the ever-growing use of deep learning, this trend has become unsustainable and could cause irreversible damage to our planet's environment if it is not addressed soon.
Objective:
In this study, we aim to analyze not only the effects of different energy-saving methods for neural networks, but also the effect of the moment at which they are applied, and what makes certain moments optimal.
Method:
We developed a novel dataset by training convolutional neural networks on 12 different computer vision datasets, applying runtime decisions regarding layer freezing, model quantization, and early stopping at different epochs in each run. We then fit an auto-regressive prediction model on the collected data, capable of predicting the accuracy and energy consumption achieved in future epochs for each method. These accuracy and energy predictions are used to estimate the optimal training path.
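A minimal PyTorch sketch of what the three runtime interventions might look like follows; this is not the authors' implementation, and the layer selection, quantized module set, and patience value are illustrative assumptions.

```python
# Illustrative sketch of the three runtime interventions; thresholds,
# layer choices, and patience are hypothetical, not the paper's values.
import torch
import torch.nn as nn


def freeze_early_layers(model: nn.Module, n_layers: int) -> None:
    """Disable gradients for the first n_layers top-level children,
    so backpropagation (and its energy cost) skips them."""
    for layer in list(model.children())[:n_layers]:
        for param in layer.parameters():
            param.requires_grad = False


def quantize_model(model: nn.Module) -> nn.Module:
    """Post-training dynamic quantization: int8 weights for Linear layers."""
    return torch.quantization.quantize_dynamic(
        model, {nn.Linear}, dtype=torch.qint8
    )


class EarlyStopping:
    """Signal a stop once validation accuracy stops improving."""

    def __init__(self, patience: int = 5):
        self.patience = patience
        self.best = float("-inf")
        self.stale_epochs = 0

    def should_stop(self, val_accuracy: float) -> bool:
        if val_accuracy > self.best:
            self.best = val_accuracy
            self.stale_epochs = 0
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience
```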
Results:
Following the model's predictions saves 56.5% of the energy consumed while also increasing validation accuracy by 2.38% by avoiding overfitting. The prediction model can predict the validation accuracy with an 8.4% error, the energy consumed with a 14.3% error, and the trade-off between the two with an 8.9% error.
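For concreteness, the general shape of an auto-regressive predictor of a per-epoch metric can be sketched as below; the lag order and the hand-rolled least-squares fit are assumptions, not the paper's exact model.

```python
# Hypothetical AR(p) predictor for a per-epoch metric (e.g. validation
# accuracy or energy). Lag order and least-squares fit are assumptions.
import numpy as np


def fit_ar(history: np.ndarray, lags: int = 3) -> np.ndarray:
    """Fit y_t = c + w_1*y_{t-1} + ... + w_p*y_{t-p} by least squares."""
    T = len(history)
    y = history[lags:]
    X = np.column_stack(
        [np.ones(T - lags)]
        + [history[lags - k : T - k] for k in range(1, lags + 1)]
    )
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


def forecast(history, coef, steps: int, lags: int = 3) -> list:
    """Roll the fitted model forward to predict future epochs."""
    h = list(history)
    for _ in range(steps):
        x = np.concatenate(([1.0], h[-1 : -lags - 1 : -1]))
        h.append(float(coef @ x))
    return h[len(history):]


# Example: predict validation accuracy three epochs ahead.
acc = np.array([0.52, 0.63, 0.71, 0.76, 0.79, 0.81, 0.82, 0.83])
coef = fit_ar(acc)
print(forecast(acc, coef, steps=3))
```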
Conclusions:
This prediction model could potentially be used by the training algorithm to decide which methods to apply to the model, and at what moment, in order to maximize the accuracy-energy trade-off.
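One assumed way such a decision step could work: score each candidate intervention by its predicted accuracy and (normalized) predicted energy, then pick the highest-scoring one. The alpha-weighted score and all numbers below are hypothetical, not the paper's formulation.

```python
# Hypothetical decision step: given forecasts of accuracy and energy
# for each candidate intervention, pick the one maximizing a simple
# alpha-weighted trade-off score (an assumption, not the paper's metric).
def choose_intervention(forecasts: dict, alpha: float = 0.9) -> str:
    """forecasts: {name: (predicted_accuracy, predicted_energy_joules)}."""
    max_energy = max(e for _, e in forecasts.values())

    def score(acc: float, energy: float) -> float:
        return alpha * acc - (1 - alpha) * (energy / max_energy)

    return max(forecasts, key=lambda m: score(*forecasts[m]))


# Example with made-up forecasts: freezing nearly matches the baseline
# accuracy at far lower energy, so it wins under this score.
print(choose_intervention({
    "continue": (0.91, 1200.0),
    "freeze_layers": (0.90, 700.0),
    "quantize": (0.88, 500.0),
    "stop_now": (0.80, 0.0),
}))  # -> "freeze_layers"
```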
Journal introduction:
The quality of software, well-defined interfaces (hardware and software), the process of digitalisation, and accepted standards in these fields are essential for building and exploiting complex computing, communication, multimedia and measuring systems. Standards can simplify the design and construction of individual hardware and software components and help to ensure satisfactory interworking.
Computer Standards & Interfaces is an international journal dealing specifically with these topics.
The journal
• Provides information about activities and progress on the definition of computer standards, software quality, interfaces and methods, at national, European and international levels
• Publishes critical comments on standards and standards activities
• Disseminates users' experiences and case studies in the application and exploitation of established or emerging standards, interfaces and methods
• Offers a forum for discussion on actual projects, standards, interfaces and methods by recognised experts
• Stimulates relevant research by providing a specialised refereed medium.