Predictive factors for allergy at 4–6 years of age based on machine learning: A pilot study

Impact factor: 2.4; JCR Q3 (Nutrition & Dietetics)
Kim Kamphorst, Alejandro Lopez-Rincon, Arine M. Vlieger, Johan Garssen, Esther van ’t Riet, Ruurd M. van Elburg
PharmaNutrition, volume 23, Article 100326. Published 2023-03-01. DOI: 10.1016/j.phanu.2022.100326. Source: https://www.sciencedirect.com/science/article/pii/S2213434422000391
Citations: 0

Abstract

Background

In Europe, allergic diseases are the most common chronic childhood illnesses and result from a complex interplay between genetic and environmental factors. A newer approach to analyzing such complex data is to apply machine learning (ML) algorithms. The aim of this pilot study was therefore to identify predictors of parental-reported allergy at 4–6 years of age using feature selection in ML.

Methods

Recursive ensemble feature selection (REFS) was used with a 20% step reduction and eight different classifiers in the ensemble; resampling was applied to address the class imbalance. Receiver operating characteristic (ROC) curves were then calculated for five different classifiers that were not part of the original feature selection ensemble.
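To make two of the ingredients above concrete, here is a minimal Python sketch of a 20% step-reduction schedule and of random oversampling for an imbalanced binary outcome. It is an illustration under stated assumptions, not the authors' implementation: the rounding rule, the stopping point, the use of oversampling rather than another resampling scheme, and the function names `reduction_schedule` and `oversample_minority` are all hypothetical. The abstract does not say how the eight classifiers rank features or how the final subset size is chosen, so that part is left out.

```python
import random


def reduction_schedule(n_features, step_percent=20, stop_at=1):
    """Feature counts visited when step_percent of the features is dropped per round.

    Assumption: the number dropped is rounded up and is always at least one.
    """
    counts = [n_features]
    while counts[-1] > stop_at:
        current = counts[-1]
        # Integer ceiling of current * step_percent / 100 (avoids float rounding).
        n_drop = max(1, (current * step_percent + 99) // 100)
        counts.append(max(stop_at, current - n_drop))
    return counts


def oversample_minority(labels, seed=0):
    """Return row indices in which the minority class is redrawn with
    replacement until both classes have the same size."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return majority + minority + extra


# 248 candidate features and a 14 vs 116 class split, as reported in the Results.
schedule = reduction_schedule(248)
labels = [1] * 14 + [0] * 116
balanced = oversample_minority(labels)
print(schedule[:4])
print(len(balanced))
```

Under these assumptions the schedule begins 248, 198, 158, 126, and the balanced index list contains 116 rows of each class. In practice, resampling would be applied to training folds only, so that duplicated cases do not leak into the evaluation data.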

Results

In total, 130 children (14 with and 116 without parental-reported allergy) and 248 features were included in the ML analyses. The REFS algorithm selected 20 features; the Multi-layer Perceptron classifier in particular had an area under the curve (AUC) of 0.86 (SD 0.08). The features predictive of allergy were: tobacco exposure during pregnancy; atopic parents; gestational age; days of diarrhea, cough, rash, and fever during the first year of life; ever having been exposed to antibiotics; and Resistin, IL-27, MMP9, CXCL8, CCL13, Vimentin, IL-4, CCL22, GAL1, IL-6, LIGHT, and GMCSF.
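For readers less familiar with the AUC, the sketch below computes it directly from its rank interpretation: the probability that a randomly chosen child with allergy receives a higher predicted score than a randomly chosen child without. The function name `roc_auc` and the toy scores are hypothetical and are not study data. The last lines show why accuracy alone would be uninformative with a 14 vs 116 split.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up scores): 5 of the 6 positive/negative pairs are ordered correctly.
toy_labels = [1, 1, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.3, 0.1]
auc = roc_auc(toy_labels, toy_scores)

# With 14 positives among 130 children, always predicting "no allergy"
# would already be correct for 116 of 130 children.
majority_accuracy = 116 / 130
print(round(auc, 3), round(majority_accuracy, 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why the AUC is more informative than accuracy when one class is much rarer than the other.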

Conclusions

This ML model shows that a combination of environmental exposures and cytokines can predict later allergy with an AUC of 0.86 despite the small sample size. The model still needs to be externally validated.

Source journal: PharmaNutrition (Agricultural and Biological Sciences: Food Science)
CiteScore: 5.70
Self-citation rate: 3.10%
Articles published: 33
Review time: 12 days