Authors: Chilukuri Lakshmi Sravani, S. Miriyala, K. Mitra
Venue: 2022 Eighth Indian Control Conference (ICC)
Publication date: 2022-12-14
DOI: https://doi.org/10.1109/ICC56513.2022.10093411
Statistical Inference and Analysis for Efficient Modeling of Environmental Pollution using Deep Neural Networks
Rapid development driven by worldwide industrialization and urbanization has become a prominent cause of air pollution. This makes it important to have a methodology for developing air quality prediction models that not only fit the data but also yield inferences that policymakers can understand. This research therefore proposes a new methodology in which the prediction model combines statistical inference with deep learning, specifically Gated Recurrent Units (GRU). The dataset consists of hourly air pollutant concentrations and meteorological data, with 14 features, measured over one year at 25 monitoring stations in Northern Taiwan. Using Analysis of Variance, the Tukey Honestly Significant Difference test, graph theory, and chi-square analysis, the voluminous dataset is first clustered based on geographical correlations, and for each cluster the features that most significantly modulate Particulate Matter (PM10) concentrations are identified. The new datasets obtained from this statistical study are then used to train the GRU model for the final predictions. The proposed model achieves an overall accuracy between 90.4% and 99.2% across all clusters. Because the methodology is generic, it can be extended to predict the transient behaviour of other pollutants in other geographical locations.
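
The abstract names Analysis of Variance as one of the statistical tools but does not spell out the computation. As a minimal sketch, assuming a one-way design in which each monitoring station forms one group of PM10 readings, the F-statistic could be computed as follows. The function name `one_way_anova_f`, the station labels, and the readings are hypothetical illustrations and are not taken from the paper.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of observations per group.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")

    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    if ss_within == 0:
        raise ValueError("within-group variance is zero; F is undefined")

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical hourly PM10 readings (micrograms per cubic metre) at three
# stations. These values are made up for illustration only.
stations = {
    "A": [40.0, 42.0, 44.0],
    "B": [41.0, 43.0, 45.0],
    "C": [60.0, 62.0, 64.0],
}
f_stat, df_b, df_w = one_way_anova_f(list(stations.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F relative to the critical value of the F distribution with the reported degrees of freedom suggests that at least one station mean differs from the others. A post hoc test such as Tukey Honestly Significant Difference would then indicate which station pairs differ, which is one plausible way such results could feed a grouping of stations.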