Applying random forest to forecast municipal solid waste generation from household fuel consumption

Impact factor: 6.4 | JCR Q1 (Environmental Sciences)
Luis Izquierdo-Horna, Ramzy Kahhat, Ian Vázquez-Rowe
Journal: Resources, Conservation & Recycling Advances, volume 27, Article 200264
Published: 2025-06-03
DOI: 10.1016/j.rcradv.2025.200264
URL: https://www.sciencedirect.com/science/article/pii/S2667378925000227

Abstract

Accurately forecasting municipal solid waste (MSW) generation is essential for designing efficient waste management systems and promoting sustainable urban development. As cities expand and consumption patterns shift, reliable data-driven approaches are increasingly necessary to address the complexities of MSW generation. This study applied the random forest (RF) algorithm, a machine learning technique, to predict MSW generation at the household level. RF was selected for its capacity to handle non-linear relationships, imbalanced datasets, and outliers. The analysis focused on data from 2019, avoiding distortions associated with the COVID-19 pandemic. The model integrated per capita MSW data with household fuel consumption indicators (i.e., natural gas, electricity, and liquefied petroleum gas) and demographic variables such as age, education level, and monthly expenditure. The case study focused on the city of Lima, Peru, using 80 % of the data for training and 20 % for testing, with hyperparameters optimized via 5-fold cross-validation. The final model explained 55 % of the variance in MSW generation (R² = 0.55). This result reflects the model’s ability to capture significant drivers of variability, although it leaves room for refinement due to factors not included in the analysis, such as cultural practices or seasonality. Among the predictors, household monthly expenditure on cooking fuels emerged as the most influential variable, reinforcing the connection between resource consumption and waste generation. These findings highlight the potential of integrating socioeconomic indicators into predictive models to enhance their reliability. By improving forecasting capabilities, this study supports targeted policies for urban waste management and sustainable resource use.
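The workflow described in the abstract (an 80/20 train/test split, hyperparameters tuned with 5-fold cross-validation, R² on the held-out data, and a ranking of predictors) can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the authors' code: the data are synthetic, the feature names in `FEATURES` are hypothetical stand-ins, and the hyperparameter grid is an arbitrary assumption, since the abstract does not state the search space.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Hypothetical predictor names; the study's actual variable coding is not given here.
FEATURES = ["fuel_expenditure", "electricity_use", "age", "education_level"]


def make_synthetic_households(n=400, seed=0):
    """Generate made-up household data (NOT the Lima data used in the study)."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(0.0, 1.0, size=(n, len(FEATURES)))
    # Assumed relationship: waste driven mostly by fuel expenditure, plus noise.
    y = 0.3 + 0.8 * X[:, 0] + 0.2 * np.sin(3.0 * X[:, 1]) + rng.normal(0.0, 0.15, n)
    return X, y


def fit_and_evaluate(X, y, seed=0):
    """80/20 split, 5-fold CV grid search on the training part, R^2 on the test part."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    # Illustrative grid only; an assumption, not the paper's search space.
    param_grid = {"n_estimators": [50, 100], "max_depth": [3, None]}
    search = GridSearchCV(
        RandomForestRegressor(random_state=seed),
        param_grid,
        cv=5,
        scoring="r2",
    )
    search.fit(X_train, y_train)
    best = search.best_estimator_
    # R^2 on data the model never saw during tuning.
    test_r2 = r2_score(y_test, best.predict(X_test))
    # Impurity-based importances, normalized by scikit-learn to sum to 1.
    importances = dict(zip(FEATURES, best.feature_importances_))
    return search.best_params_, test_r2, importances


X, y = make_synthetic_households()
best_params, test_r2, importances = fit_and_evaluate(X, y)
print("best hyperparameters:", best_params)
print("test R^2:", round(test_r2, 3))
for name, score in sorted(importances.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score:.3f}")
```

Because the synthetic target is constructed to depend mainly on `fuel_expenditure`, that feature should rank first here by design; the numbers printed say nothing about the study's actual results.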

Source journal
Resources, Conservation & Recycling Advances (Environmental Science, General)
CiteScore: 11.70
Self-citation rate: 0.00%
Articles published: 0
Review time: 76 days