Development of an Atopic Dermatitis Incidence Rate Prediction Model for South Korea Using Air Pollutants Big Data: Comparisons Between Regression and Artificial Neural Network
Byeonggeuk Lim, Poong-Mo Park, Da-Mee Eun, Dong-Woo Kim, Cheonwoong Kang, Ki-Joon Jeon, SeJoon Park, Jong-Sang Youn
Korean Journal of Chemical Engineering (IF 2.9, JCR Q2). Published 2024-07-31. DOI: https://doi.org/10.1007/s11814-024-00244-9
Abstract
We developed models to predict the incidence of atopic dermatitis using regression analysis and artificial neural networks (ANN). The first set of models, referred to as the average models, used air pollutants (SO2, CO, O3, NO2, and PM10), meteorological factors (temperature, humidity, wind speed, and precipitation), population rates, and clinical data from South Korea as inputs. The second set, referred to as the sex and age models, used sex and age as variables in place of population rates. Both sets were designed to forecast incidence rates nationwide (NW) and for 16 administrative districts (AD) in South Korea, comprising seven metropolitan areas and nine provinces. We found that SO2 significantly affected the incidence rate, and that including regional variables in the AD models helped account for regional variation in incidence rates. The average models generally predicted incidence rates accurately; among the five air pollutants studied, SO2 was selected as the key independent variable in the regression models. For the regression-based average models, the R² values are 0.70 for the NW model and 0.89 for the AD model. For the ANN-based average models, the R² values are 0.84 for the NW model and 0.90 for the AD model, indicating slightly higher predictive accuracy. For the sex and age models, we separated children under 10 years of age from those aged 10 and older. In these models, ANN was more accurate than regression, with R² values of 0.95 (NW, under 10 years old), 0.92 (AD, under 10 years old), 0.96 (NW, over 10 years old), and 0.92 (AD, over 10 years old).
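The R² values quoted in the abstract are coefficients of determination, defined as 1 - SS_res / SS_tot. The following is a minimal sketch of how such a score can be computed for a one-variable least-squares fit. It is not the authors' model: the function names (`fit_simple_ols`, `r_squared`) and the SO2 and incidence numbers are hypothetical, invented purely for illustration.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((yt - yp) ** 2 for yt, yp in zip(y_true, y_pred))
    ss_tot = sum((yt - y_bar) ** 2 for yt in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical illustration data (NOT taken from the paper):
# SO2 concentration and an atopic dermatitis incidence rate.
so2 = [2.0, 2.5, 3.0, 3.5, 4.0, 4.5]
incidence = [10.1, 11.2, 11.8, 13.1, 13.9, 15.2]

slope, intercept = fit_simple_ols(so2, incidence)
predicted = [slope * x + intercept for x in so2]
score = r_squared(incidence, predicted)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={score:.3f}")
```

The same `r_squared` function can score any model's predictions, which is what makes it a convenient common yardstick when comparing a regression model against an ANN on the same held-out data.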
About the journal:
The Korean Journal of Chemical Engineering provides a global forum for the dissemination of research in chemical engineering. The Journal publishes significant research results obtained in the Asia-Pacific region, and simultaneously introduces recent technical progress made in other areas of the world to this region. Submitted research papers must be of potential industrial significance and specifically concerned with chemical engineering. The editors will give preference to papers having a clearly stated practical scope and applicability in the areas of chemical engineering, and to those where new theoretical concepts are supported by new experimental details. The Journal also regularly publishes featured reviews on emerging and industrially important subjects of chemical engineering as well as selected papers presented at international conferences on the subjects.