Training Green AI Models Using Elite Samples

IF 3.9 | Zone 3, Computer Science | JCR Q2, Computer Science, Hardware & Architecture
Mohammed Alswaitti;Roberto Verdecchia;Grégoire Danoy;Pascal Bouvry;Johnatan E. Pecero
IEEE Transactions on Sustainable Computing, vol. 10, no. 5, pp. 858-872. Published 2025-02-21. Journal article.
DOI: 10.1109/TSUSC.2025.3544430
https://ieeexplore.ieee.org/document/10897883/

Abstract

The substantial increase in AI model training has considerable environmental implications, requiring energy-efficient and sustainable AI practices. On the one hand, data-centric approaches show great potential for training energy-efficient AI models. On the other hand, instance selection methods can train AI models on minimised training sets with negligible performance degradation. Despite the growing interest in both topics, the impact of data-centric training set selection on energy efficiency remains unexplored to date. This paper presents an evolutionary-based sampling framework aimed at (i) identifying elite training samples tailored to dataset and model pairs, (ii) comparing model performance and energy efficiency gains against typical model training practice, and (iii) investigating the feasibility of this framework for fostering sustainable model training practices. To evaluate the proposed framework, we conducted an empirical experiment including 8 commonly used AI classification models and 25 publicly available datasets. The results show that, by using 10% elite training samples, the models’ performance can show a 50% improvement, with energy savings of 98% compared to common training practice. In essence, this study establishes a new benchmark for AI researchers and practitioners interested in improving the environmental sustainability of AI model training via data-centric approaches.
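The abstract does not specify the evolutionary algorithm, so the following is only a minimal sketch of the general idea, under stated assumptions: a population of fixed-size index subsets (10% of the training set) is evolved by truncation selection and swap mutation, with validation accuracy as fitness. The nearest-centroid model, the synthetic data, and every function name here are hypothetical stand-ins, not the authors' method, and energy consumption is not measured.

```python
import random

def make_data(n, rng):
    """Two noisy 2-D Gaussian blobs: a synthetic stand-in for a real dataset."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = 2.0 if label else -2.0
        data.append(((rng.gauss(centre, 1.5), rng.gauss(centre, 1.5)), label))
    return data

def fit_centroids(samples):
    """Toy 'model' (assumption): one mean point per class."""
    sums = {}
    for (x, y), label in samples:
        sx, sy, count = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, count + 1)
    return {k: (sx / c, sy / c) for k, (sx, sy, c) in sums.items()}

def accuracy(centroids, samples):
    """Fraction of samples whose nearest centroid has the right label."""
    correct = 0
    for (x, y), label in samples:
        pred = min(centroids, key=lambda k: (x - centroids[k][0]) ** 2
                                            + (y - centroids[k][1]) ** 2)
        correct += pred == label
    return correct / len(samples)

def fitness(subset, train, val):
    """Validation accuracy of a model trained only on the chosen indices."""
    return accuracy(fit_centroids([train[i] for i in subset]), val)

def mutate(subset, n_train, rng):
    """Swap one selected index for an unselected one; subset size is unchanged."""
    child = set(subset)
    child.remove(rng.choice(sorted(child)))
    pool = [i for i in range(n_train) if i not in child]
    child.add(rng.choice(pool))
    return frozenset(child)

def select_elite(train, val, fraction=0.1, pop_size=20, generations=30, seed=0):
    """Evolve index subsets; the best half survives each generation (elitism)."""
    rng = random.Random(seed)
    k = max(1, int(len(train) * fraction))
    population = [frozenset(rng.sample(range(len(train)), k))
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=lambda s: fitness(s, train, val),
                        reverse=True)
        survivors = ranked[: pop_size // 2]
        children = [mutate(rng.choice(survivors), len(train), rng)
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    return max(population, key=lambda s: fitness(s, train, val))

# Usage: compare training on the elite 10% with training on the full set.
rng = random.Random(42)
train, val = make_data(200, rng), make_data(100, rng)
elite = select_elite(train, val)
full_acc = accuracy(fit_centroids(train), val)
elite_acc = fitness(elite, train, val)
print(f"elite size={len(elite)}, elite acc={elite_acc:.2f}, full acc={full_acc:.2f}")
```

For brevity the sketch selects and scores on the same validation split; a fair comparison would report accuracy on a separate held-out test set and would log training energy with a measurement tool.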
Source journal: IEEE Transactions on Sustainable Computing (Mathematics: Control and Optimization)
CiteScore: 7.70 | Self-citation rate: 2.60% | Articles published: 54