Predictive factors for allergy at 4–6 years of age based on machine learning: A pilot study
Kim Kamphorst, Alejandro Lopez-Rincon, Arine M. Vlieger, Johan Garssen, Esther van ’t Riet, Ruurd M. van Elburg
PharmaNutrition, published 2023-03-01
DOI: 10.1016/j.phanu.2022.100326
https://www.sciencedirect.com/science/article/pii/S2213434422000391
Abstract
Background
In Europe, allergic diseases are the most common chronic childhood illnesses and result from a complex interplay between genetic and environmental factors. Machine learning (ML) algorithms offer a new approach to analyzing such complex data. The aim of this pilot study was therefore to find predictors of parental-reported allergy at 4–6 years of age by using feature selection in ML.
Methods
Recursive ensemble feature selection (REFS) was used, with a 20% step reduction, eight different classifiers in the ensemble, and resampling to address the class imbalance. Thereafter, Receiver Operating Characteristic (ROC) curves were calculated for five different classifiers that were not part of the original feature selection ensemble.
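The general shape of a recursive, ensemble-ranked elimination loop can be sketched as follows. This is a minimal illustration and not the study's implementation: three simple univariate scorers stand in for the eight classifiers, the resampling step is assumed to be random oversampling of the minority class, the stopping point (`target`) is a hypothetical parameter, and the data are synthetic. All function names are made up for this example.

```python
import random
from statistics import mean, pstdev


def split_by_class(values, labels):
    """Separate one feature column into positive-class and negative-class values."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    return pos, neg


def mean_gap(values, labels):
    """Absolute difference between the class means."""
    pos, neg = split_by_class(values, labels)
    return abs(mean(pos) - mean(neg))


def scaled_gap(values, labels):
    """Class-mean gap divided by the overall spread of the feature."""
    spread = pstdev(values)
    return mean_gap(values, labels) / spread if spread > 0 else 0.0


def auc_distance(values, labels):
    """How far the single-feature AUC is from chance (0.5)."""
    pos, neg = split_by_class(values, labels)
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return abs(wins / (len(pos) * len(neg)) - 0.5)


# Assumption: simple scorers stand in for the classifiers of the real ensemble.
SCORERS = [mean_gap, scaled_gap, auc_distance]


def oversample(rows, labels, rng):
    """Assumed resampling: draw minority rows with replacement until balanced."""
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    keep = list(range(len(labels))) + extra
    return [rows[i] for i in keep], [labels[i] for i in keep]


def recursive_ensemble_selection(rows, labels, scorers, target, step=0.2):
    """Drop the lowest-ranked `step` fraction of features until `target` remain."""
    active = list(range(len(rows[0])))
    while len(active) > target:
        columns = {j: [row[j] for row in rows] for j in active}
        votes = {j: 0 for j in active}
        for scorer in scorers:
            ordered = sorted(active, key=lambda j: scorer(columns[j], labels))
            for rank, j in enumerate(ordered):
                votes[j] += rank  # a higher rank total means a more useful feature
        n_drop = min(max(1, int(len(active) * step)), len(active) - target)
        survivors = sorted(active, key=lambda j: votes[j], reverse=True)
        active = sorted(survivors[: len(active) - n_drop])
    return active


def make_toy_data(rng, n_pos=14, n_neg=116, n_features=40, n_informative=3):
    """Synthetic data: only the first `n_informative` columns differ by class."""
    labels = [1] * n_pos + [0] * n_neg
    rows = []
    for y in labels:
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        if y == 1:
            for j in range(n_informative):
                row[j] += 2.0
        rows.append(row)
    return rows, labels


rng = random.Random(0)
rows, labels = make_toy_data(rng)
balanced_rows, balanced_labels = oversample(rows, labels, rng)
selected = recursive_ensemble_selection(balanced_rows, balanced_labels, SCORERS, target=5)
print(selected)
```

Summing ranks across the scorers keeps any single ranking from deciding which features are dropped, which is the point of using an ensemble. In a fuller version each scorer would be a trained classifier, and the number of features to keep would be chosen from the ensemble's performance rather than fixed in advance.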
Results
In total, 130 children (14 with and 116 without parental-reported allergy) and 248 features were included in the ML analyses. The REFS algorithm selected 20 features; with these, the Multi-layer Perceptron classifier in particular reached an area under the curve (AUC) of 0.86 (SD 0.08). The features predictive of allergy were: tobacco exposure during pregnancy; atopic parents; gestational age; days of diarrhea, cough, rash, and fever during the first year of life; ever having been exposed to antibiotics; and Resistin, IL-27, MMP9, CXCL8, CCL13, Vimentin, IL-4, CCL22, GAL1, IL-6, LIGHT, and GMCSF.
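With 14 positive and 116 negative cases, plain accuracy says little: a rule that predicts "no allergy" for every child is already right about 89% of the time. The sketch below uses the class counts reported above to show this, and computes the AUC as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The toy labels and scores are invented for illustration and are not study data.

```python
def auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Class counts from the Results: 14 with and 116 without parental-reported allergy.
labels = [1] * 14 + [0] * 116
always_negative = [0.0] * 130  # constant score: every child predicted "no allergy"

accuracy = sum(1 for y in labels if y == 0) / len(labels)
print(f"majority-class accuracy: {accuracy:.3f}")                      # 0.892
print(f"AUC of constant score:   {auc(labels, always_negative):.3f}")  # 0.500

# Invented toy example: 5 of the 6 positive/negative pairs are ordered correctly.
toy_labels = [1, 1, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.3, 0.1]
print(f"toy AUC: {auc(toy_labels, toy_scores):.3f}")                   # 0.833
```

The constant rule scores 0.5 on AUC despite its high accuracy, which is one reason AUC is a common choice for imbalanced data like this.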
Conclusions
In this pilot study, a combination of environmental exposures and cytokines predicted later allergy with an AUC of 0.86 despite the small sample size. The model still needs to be externally validated.