Development of plastic waste generation distribution model using remote sensing data product and machine learning
Elprida Agustina, Emenda Sembiring, Anjar Dimara Sakti, Like Hana Fournida Purba
Cleaner Waste Systems, vol. 11, Article 100324. Published 2025-06-01. DOI: 10.1016/j.clwas.2025.100324
https://www.sciencedirect.com/science/article/pii/S2772912525001228
Citation count: 0
Abstract
This research aims to map the distribution of plastic waste generation at the household level to establish baseline data for plastic waste management. The study covers 905,935 households in Bali Province. Household-level variables were identified from previous studies: house area, population density, rural or urban designation, and economic status at specific coordinates. Three remote sensing data products served as proxies for these variables: VIIRS (Visible Infrared Imaging Radiometer Suite) night-time Day/Night Band data for economic status, WorldPop Global Project population data for population density, and impervious surface data for urban/rural classification. A total of 200 primary sample points were collected, each recording household plastic waste generation, coordinates, and house area. Linear and nonlinear regression machine learning algorithms were trained with plastic waste generation as the dependent variable and the extracted remote sensing values and house size as independent variables. The best-performing model was the nonlinear LGBM (Light Gradient Boosting Machine) regressor, which achieved an R² of 0.882, an RMSE (Root Mean Squared Error) of 18.374, and a MAPE (Mean Absolute Percentage Error) of 12.877 on the testing data. Ranked by feature importance, the variables were economic status, population density, house size, and urban or rural classification.
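For readers unfamiliar with the three reported scores, the sketch below shows one conventional way to compute R², RMSE, and MAPE from observed and predicted values. It is an illustration only, not the authors' code: the function names and the sample numbers are hypothetical, and it assumes MAPE is expressed in percent, which the abstract does not state. The abstract also does not give the unit of plastic waste generation, so the RMSE here carries whatever unit the inputs use.

```python
import math

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root mean squared error, in the same unit as the target."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (assumed convention).

    Undefined when any observed value is zero.
    """
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)

# Hypothetical observed vs. predicted household values (not from the study).
observed = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]

print(round(r2_score(observed, predicted), 3))  # 0.992
print(round(rmse(observed, predicted), 3))      # 10.0
print(round(mape(observed, predicted), 3))      # 5.208
```

R² and MAPE are scale-free, while RMSE depends on the scale of the target, which is one reason the three are usually reported together on held-out data.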