Comparative Analysis of Machine Learning Techniques for Predicting Air Pollution

M. U. Ashraf, Farwa Akram, Sardar Usman
Journal: Lahore Garrison University Research Journal of Computer Science and Information Technology
Publication type: Journal Article
Publication date: 2022-06-26
DOI: 10.54692/lgurjcsit.2022.0602270
URL: https://doi.org/10.54692/lgurjcsit.2022.0602270
Citations: 1

Abstract

Modern, motorized lifestyles have driven up air pollution, which has become the biggest threat to healthy living. The problem is growing more severe in developing countries, Pakistan among them. This study therefore had three aims: to propose an architecture capable of real-time air pollution prediction, to survey which algorithms earlier studies adopted most often, and to describe the toxic effects of air pollution on human health. Because an adequate dataset for Pakistan was not available, the experiments used a large dataset from Seoul covering three years (2017-2019), with 647,512 instances and 11 attributes. Four algorithms were evaluated: Random Forest, Linear Regression, Decision Tree, and XGBoost (XGB). XGB proved the most promising and feasible, giving the lowest errors when predicting the concentrations of NO2, O3, SO2, PM10, PM2.5, and CO: RMSE values of 0.0111, 0.0262, 0.0168, 49.64, 41.68, and 0.1856, and MAE values of 0.0067, 0.0096, 0.0017, 12.28, 7.63, and 0.0982, respectively. The survey also found that Random Forest was the algorithm most often preferred in earlier air pollution prediction studies, and that many studies agree that air pollution is highly detrimental to human health, with long-term exposure in particular causing lung cancer and respiratory and cardiovascular diseases.
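
The abstract ranks the four algorithms by RMSE and MAE but does not define either metric. The following is a minimal sketch, assuming the standard definitions (RMSE as the square root of the mean squared residual, MAE as the mean absolute residual). The function names, model labels, and numbers are hypothetical and made up for illustration; they are not the paper's code or data.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: sqrt of the mean of squared residuals."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def mae(actual, predicted):
    """Mean absolute error: mean of the absolute residuals."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rank_models(actual, predictions_by_model):
    """Return (name, rmse, mae) tuples sorted by RMSE, lowest first."""
    scores = [
        (name, rmse(actual, preds), mae(actual, preds))
        for name, preds in predictions_by_model.items()
    ]
    return sorted(scores, key=lambda row: row[1])


# Made-up pollutant readings and predictions from two hypothetical models.
observed = [0.030, 0.025, 0.040, 0.035]
candidates = {
    "model_a": [0.031, 0.024, 0.038, 0.036],
    "model_b": [0.020, 0.030, 0.050, 0.030],
}
ranking = rank_models(observed, candidates)

for name, model_rmse, model_mae in ranking:
    print(f"{name}: RMSE={model_rmse:.4f} MAE={model_mae:.4f}")
```

Both metrics are expressed in the units of the target variable, which may be why the values reported for PM10 and PM2.5 are orders of magnitude larger than those for the gaseous pollutants. RMSE penalizes large individual errors more heavily than MAE, so reporting both gives a fuller picture of each model's error.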