Predicting municipal solid waste generation using artificial intelligence: A hybrid approach of entropy analysis and SHAP for optimal feature selection

IF 7.1 | Zone 2, Environmental Science and Ecology | Q1, ENGINEERING, ENVIRONMENTAL
Vahid Nourani, Aida H. Baghanam, Elham Samadi, Selin Uzelaltinbulat
Journal: Waste Management, volume 205, Article 115012
Publication date: 2025-07-09
Publication type: Journal Article
Open access: no
DOI: 10.1016/j.wasman.2025.115012
URL: https://www.sciencedirect.com/science/article/pii/S0956053X25004234
Citations: 0

Abstract

Managing municipal solid waste (MSW) is one of the primary challenges in urban areas. To improve the accuracy of waste generation predictions, this study employed a hybrid approach that integrates Mutual Information (MI) with Shapley Additive Explanations (SHAP) for feature selection in Artificial Intelligence (AI) modeling. Two models were used: the Feed Forward Neural Network (FFNN) and Long Short-Term Memory (LSTM). The FFNN, a shallow learning model, is simpler and effective at capturing general patterns in data, while the LSTM, a deep learning model, is better suited to autoregressive tasks such as predicting MSW generation. The proposed hybrid approach identified the key factors influencing MSW generation more precisely and improved the prediction models. To test the methodology under different conditions, it was applied to meteorological and socio-economic data from three cities: Austin in the United States, Ballarat in Australia, and Boralesgamuwa in Sri Lanka. The dominant factors identified were population, income, the Consumer Price Index (CPI), and lagged MSW variables with lags of 5, 10, and 20 days. Modeling performance was evaluated using the Determination Coefficient (DC) and Root Mean Square Error (RMSE). In Austin, the FFNN achieved a DC of 0.7226 during training and 0.6529 during testing. In Ballarat, the FFNN achieved training and testing DC values of 0.7037 and 0.6941, respectively. In Boralesgamuwa, due to severe data limitations, the model did not train well and showed poor performance in predictions (DC and RMSE values were significantly lower). The better performance of the model in Austin could be attributed to the longer temporal coverage of the data and greater stability in socio-economic patterns, while higher variability in socio-economic factors in Ballarat may have slightly reduced the model's accuracy. The results from Boralesgamuwa also highlight the importance of access to high-quality, consistent data for developing accurate models. These findings demonstrate that the MI-SHAP method can enhance prediction accuracy by identifying both linear and nonlinear relationships among variables, and that it provides deeper insight into the dynamics governing waste generation. This methodology can aid in developing sustainable MSW management policies across various regions.
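The abstract names three building blocks that are easy to illustrate in isolation: lagged MSW inputs, a mutual-information score for ranking candidate inputs, and the DC and RMSE metrics. The following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes DC is the usual coefficient of determination (1 minus the ratio of squared error to variance about the observed mean), estimates MI with a simple equal-width histogram, and leaves out SHAP and the neural networks entirely. All function names and the synthetic series are hypothetical.

```python
import math
from collections import Counter


def lagged_pairs(series, lag):
    """Pair each value with the value observed `lag` steps earlier."""
    return [(series[t - lag], series[t]) for t in range(lag, len(series))]


def mutual_information(xs, ys, bins=4):
    """Histogram estimate of MI (in nats) between two equal-length sequences.

    Assumption: equal-width binning; the paper's estimator is not stated here.
    """
    def to_bins(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # fall back to 1.0 for constant input
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = to_bins(xs), to_bins(ys)
    n = len(xs)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    return sum(
        (c / n) * math.log((c / n) / ((px[i] / n) * (py[j] / n)))
        for (i, j), c in joint.items()
    )


def rmse(observed, predicted):
    """Root mean square error between observations and predictions."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def determination_coefficient(observed, predicted):
    """DC, assumed here to be 1 - SSE / SST about the observed mean."""
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    sst = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / sst


if __name__ == "__main__":
    # Hypothetical daily MSW series (arbitrary units), for illustration only.
    series = [50 + 10 * math.sin(t / 7.0) + (t % 3) for t in range(200)]
    for lag in (5, 10, 20):  # the lags named in the abstract
        past, present = zip(*lagged_pairs(series, lag))
        print(lag, round(mutual_information(past, present), 3))
```

Under these assumptions, candidate lags with higher MI against the current value would be kept for modeling, and a trained model's predictions would then be scored with `determination_coefficient` and `rmse` on separate training and testing splits.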
Source journal: Waste Management
Category: Environmental Science - Engineering: Environmental
CiteScore: 15.60
Self-citation rate: 6.20%
Articles published: 492
Review time: 39 days
About the journal: Waste Management is devoted to the presentation and discussion of information on solid wastes; it covers the entire lifecycle of solid wastes. Scope: addresses solid wastes in both industrialized and economically developing countries; covers various types of solid wastes, including municipal (e.g., residential, institutional, commercial, light industrial), agricultural, and special (e.g., C and D, healthcare, household hazardous wastes, sewage sludge).