Evaluating the Performance of Random Forest and Multiple Linear Regression for Higher Observed PM10 Concentrations

Nurhafizah Ahmad, A. Z. Ul-Saufie, W. N. Shaziayani, Aida Wati Zainan Abidin, Nur Elis Sharmila Zulazm, S. Harb
Israa University Journal for Applied Science, published 2022-10-01
DOI: 10.52865/whpm9019 (https://doi.org/10.52865/whpm9019)
Citations: 0

Abstract

Background: Air pollution has a direct impact on human health. The ability to predict air pollutant concentrations accurately is therefore vital for raising public awareness and for improving air quality management. Aim: This research aims to predict PM10 concentrations in Malaysia, specifically on Langkawi Island, using random forest and multiple linear regression. Method: The predictive analysis was based on hourly air pollution data from 2003 to 2017. The eight parameters chosen in this study were PM10, NO2, O3, CO, SO2, Relative Humidity (RH), Temperature (T), and Wind Speed (WS). The hourly trends of PM10, SO2, NO2, CO, and O3 at Langkawi Island were below the limits recommended by the Malaysian Ambient Air Quality Guidelines (MAAQG). Multiple linear regression (MLR) and random forest (RF) models were fitted and compared on their prediction accuracy. Result: The values of RMSE, NAE, IA, PA and R2 were 8.0698, 0.1368, 0.8584, 0.7737 and 0.5984 respectively for MLR, and 6.674038, 0.107664, 0.911974, 0.852570 and 0.726681 respectively for RF. RF was chosen as the better model because its error measures (RMSE, NAE) are lower and its accuracy measures (IA, PA, R2) are closer to 1. Nevertheless, neither PM10 model (RF or MLR) is able to account for the higher observed concentrations.
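The comparison in the Result section rests on two kinds of measures: error measures (RMSE, NAE), where smaller is better, and accuracy measures (IA, PA, R2), where values nearer 1 are better. The abstract does not state the formulas used, so the following is only a minimal sketch under assumed definitions that are common in the air quality literature: RMSE as the root of the mean squared error, NAE as total absolute error divided by total observed concentration, IA as Willmott's index of agreement, and R2 as the squared Pearson correlation. PA is left out because its definition varies between papers. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math

# Assumed definitions; the source abstract does not give the formulas.

def rmse(obs, pred):
    """Root mean squared error (error measure, 0 is best)."""
    n = len(obs)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / n)

def nae(obs, pred):
    """Normalised absolute error: sum |P - O| / sum O (0 is best)."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / sum(obs)

def index_of_agreement(obs, pred):
    """Willmott's index of agreement (accuracy measure, 1 is best)."""
    mean_obs = sum(obs) / len(obs)
    num = sum((p - o) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
              for o, p in zip(obs, pred))
    return 1.0 - num / den

def r_squared(obs, pred):
    """Squared Pearson correlation between observed and predicted."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_pred = sum(pred) / n
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(obs, pred))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_pred = sum((p - mean_pred) ** 2 for p in pred)
    return cov ** 2 / (var_obs * var_pred)

if __name__ == "__main__":
    # Hypothetical hourly PM10 values, for illustration only.
    observed = [10.0, 20.0, 30.0, 40.0]
    predicted = [12.0, 18.0, 33.0, 37.0]
    print("RMSE:", rmse(observed, predicted))
    print("NAE: ", nae(observed, predicted))
    print("IA:  ", index_of_agreement(observed, predicted))
    print("R2:  ", r_squared(observed, predicted))
```

Computing the same measures for two models on one held-out set gives a like-for-like comparison of the kind reported for MLR and RF. Restricting `observed` and `predicted` to the hours with the highest observed values would be one way to look at how well a model handles high concentrations.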