Using Logistic Regression with Time-Stratified Method for Air Pollution Datasets Forecasting
S. Mohammad, O. Hannon
Iraqi Journal of Statistical Sciences, published 2020-06-01
DOI: 10.33899/IQJOSS.2020.165444 (https://doi.org/10.33899/IQJOSS.2020.165444)
Citations: 1
Abstract
Studying and forecasting particulate matter (PM10) is necessary to control and reduce damage to the environment and human health. Many pollutants that are sources of air pollution may affect the PM10 variable. The datasets studied were taken from the Kuala Lumpur meteorological station, Malaysia. Logistic regression (LR) is built using the generalized linear model, a special case of linear statistical methods; it may therefore give inaccurate results when used with nonlinear datasets. The time-stratified (TS) method, in different styles, is proposed to make the datasets more homogeneous. It involves ordering similar seasons from different years together to form new variables that are smoother than the originals. The results of the LR model in this study show better performance on the time-stratified datasets than on the full dataset. In conclusion, LR forecasting can be relied upon after time-stratifying the datasets to achieve greater accuracy with nonlinear multivariate datasets in which PM10 is the dependent variable.
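The time-stratification step described in the abstract, pooling the same season across different years, might be sketched roughly as follows. This is only an illustration under stated assumptions: the month-to-season mapping, the function name `time_stratify`, and the sample PM10 values are all hypothetical and are not taken from the paper, which does not specify its stratification "styles" here.

```python
from collections import defaultdict
from datetime import date

# Hypothetical month-to-season mapping (an assumption, not from the paper).
SEASON_BY_MONTH = {
    11: "northeast_monsoon", 12: "northeast_monsoon", 1: "northeast_monsoon",
    2: "northeast_monsoon", 3: "northeast_monsoon",
    4: "inter_monsoon", 10: "inter_monsoon",
    5: "southwest_monsoon", 6: "southwest_monsoon", 7: "southwest_monsoon",
    8: "southwest_monsoon", 9: "southwest_monsoon",
}


def time_stratify(records):
    """Group (date, value) records by season, pooling years together.

    Records are sorted chronologically first, so each stratum lists the
    same season from successive years one after another.
    """
    strata = defaultdict(list)
    for day, value in sorted(records, key=lambda r: r[0]):
        strata[SEASON_BY_MONTH[day.month]].append((day, value))
    return dict(strata)


# Made-up PM10 readings, for illustration only.
records = [
    (date(2015, 1, 10), 48.0),
    (date(2015, 7, 10), 71.0),
    (date(2016, 1, 12), 52.0),
    (date(2016, 7, 15), 80.0),
    (date(2015, 4, 5), 60.0),
]

strata = time_stratify(records)
for season, rows in strata.items():
    print(season, [(d.isoformat(), v) for d, v in rows])
```

Under this sketch, a separate LR model would then be fitted to each stratum rather than to the full series; the fitting step itself is omitted here because the abstract gives no details about predictors or how PM10 is coded as the dependent variable.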