The Application of Machine Learning Models in the Prediction of PM2.5/PM10 Concentration
Author: Xinzhi Lin
Published in: Proceedings of the 2021 4th International Conference on Computers in Management and Business
Publication date: 2021-01-30
DOI: 10.1145/3450588.3450605 (https://doi.org/10.1145/3450588.3450605)
Citations: 3
Abstract
While the world economy and science are developing rapidly, Beijing is experiencing chronic air pollution. Air quality matters for people's travel, the development of enterprises, and the normal operation of traffic. PM2.5 and PM10 are the main contributors to air pollution, so predicting their concentration in the air is highly valuable [1]. Although some traditional models (such as basic linear regression) have been proposed to predict PM2.5/PM10 concentration, they include few predictor variables and run with low efficiency and low accuracy. In the big data era, it is necessary to build models that can handle large and varied data sets. With adequate data sets from different meteorological stations in Beijing, we can use a richer set of variables, such as the mass of SO2 and NO2, wind direction, and other weather observations, to predict PM2.5/PM10 concentration. We build machine learning models with higher efficiency, higher accuracy, and stronger learning ability, whose primary algorithms include multiple linear regression, decision tree, boosting and random forest based on decision trees, and neural network. The results indicate that the models based on neural networks and ensemble learning give the better prediction performance. Boosting performs best among these models, achieving an R-square of 84.2% and 75.7% on the test set for PM2.5 and PM10, respectively.
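For readers unfamiliar with the R-square metric quoted above, the following is a minimal sketch of how it is commonly computed, using the standard definition R² = 1 - SS_res / SS_tot. It is not taken from the paper: the function name `r_squared` and the sample concentration values are hypothetical and only illustrate the arithmetic. The function returns a fraction, so a return value of 0.842 corresponds to the 84.2% form used in the abstract.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-square is undefined when y_true is constant")
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted PM2.5 values (illustrative only,
# not data from the paper).
observed = [35.0, 80.0, 120.0, 60.0, 45.0]
predicted = [40.0, 75.0, 110.0, 65.0, 50.0]

score = r_squared(observed, predicted)
print(f"R-square: {score:.3f} ({score * 100:.1f}%)")
```

A model that always predicts the mean of the observed values scores 0 under this definition, and a perfect model scores 1, which is why a higher test-set R-square is read as a better fit.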