Machine learning-based early prediction of asthma in preschoolers: The COCOA birth cohort study.

Impact factor: 4.5
Chang Hoon Han, Seok-Jae Heo, Haerin Jang, So-Yeon Lee, Ji Soo Park, Dong In Suh, Youn Ho Shin, Jihyun Kim, Kangmo Ahn, Myung Hyun Sohn, Eom Ji Choi, Sun Hee Choi, Hey-Sung Baek, Soo-Jong Hong, Kyung Won Kim, Inkyung Jung, Soo Yeon Kim
Pediatric Allergy and Immunology (official publication of the European Society of Pediatric Allergy and Immunology), vol. 36, no. 10, e70223. Published 2025-10-01. DOI: 10.1111/pai.70223 (https://doi.org/10.1111/pai.70223). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12533341/pdf/

Abstract

Background: Early prediction of asthma in preschoolers, which is crucial for timely intervention, remains challenging. This study aimed to develop a machine learning (ML)-based model and a questionnaire-based scoring tool for the prediction of asthma at age 3 years.

Methods: Data from the COhort for Childhood Origin of Asthma and allergic diseases (COCOA), a comprehensive prospective birth cohort in South Korea, were used. Children with complete 3-year follow-up (n = 2007) were divided into development (n = 1472) and validation (n = 535) cohorts based on birth year. Asthma diagnosis at age 3 years was based on physician diagnosis, recurrent wheezing episodes, asthma treatment, or parental reports. Random Forest-based predictive models were developed using data collected until the age of 2 years, with features first selected via least absolute shrinkage and selection operator (LASSO) regression. A questionnaire-based scoring tool was also developed and compared with multiple ML algorithms.
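
The modelling steps described in the Methods (a split by birth year, LASSO feature selection, then a Random Forest) can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' code: the birth years, the cutoff year, the `alpha` value, the number of trees, and the helper names `make_synthetic_cohort` and `fit_and_validate` are all assumptions introduced here, and a plain linear LASSO on the 0/1 outcome stands in for whatever LASSO variant the study used.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso
from sklearn.metrics import roc_auc_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_synthetic_cohort(n_children=1200, n_features=30, seed=0):
    """Synthetic stand-in for cohort data; nothing here comes from COCOA."""
    X, y = make_classification(
        n_samples=n_children,
        n_features=n_features,
        n_informative=6,
        n_redundant=0,
        n_clusters_per_class=1,
        weights=[0.85, 0.15],  # assumed minority of asthma cases
        random_state=seed,
    )
    rng = np.random.default_rng(seed)
    # Hypothetical birth years, used only to demonstrate a temporal split.
    birth_year = rng.integers(2008, 2016, size=n_children)
    return X, y, birth_year


def fit_and_validate(X, y, birth_year, cutoff_year=2013, alpha=0.01, seed=0):
    """Fit on earlier-born children, report AUROC on later-born children."""
    dev = birth_year <= cutoff_year  # development cohort
    val = ~dev                       # validation cohort
    model = make_pipeline(
        StandardScaler(),
        # Keep only features whose LASSO coefficient is non-zero.
        SelectFromModel(Lasso(alpha=alpha)),
        RandomForestClassifier(n_estimators=300, random_state=seed),
    )
    model.fit(X[dev], y[dev])
    risk = model.predict_proba(X[val])[:, 1]  # predicted probability of asthma
    n_selected = int(model.named_steps["selectfrommodel"].get_support().sum())
    return roc_auc_score(y[val], risk), n_selected


X, y, birth_year = make_synthetic_cohort()
auc, n_selected = fit_and_validate(X, y, birth_year)
print(f"validation AUROC = {auc:.3f} using {n_selected} selected features")
```

Placing the selection step inside the pipeline means it is fitted on the development cohort only, so the validation children never influence which features are kept.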

Results: The ML-based prediction models showed improved performance as the data accumulated. The 6-month, 1-year, and 2-year models had area under the receiver operating characteristic curve (AUROC) values of 0.614, 0.726, and 0.774, respectively, in the validation cohort. The performance of the questionnaire-based scoring tool (AUROC, 0.790) was comparable to that of the ML-based model. Important predictors included paternal total IgE levels, maternal iron supplementation during pregnancy, parental asthma history, nut allergy history, and recent lower respiratory infections.
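
For readers less familiar with the metric, AUROC is the probability that a randomly chosen child with asthma receives a higher predicted risk than a randomly chosen child without asthma, with ties counted as one half. The sketch below computes it directly from that definition and applies it to an additive questionnaire-style score. The item names echo predictors listed above, but the point weights and the toy answers are purely hypothetical; the abstract does not report the actual scoring tool.

```python
# Hypothetical point weights; the published tool's items and weights
# are not given in the abstract.
HYPOTHETICAL_POINTS = {
    "parental_asthma": 2,
    "nut_allergy": 2,
    "recent_lower_respiratory_infection": 1,
}


def questionnaire_score(answers):
    """Sum the points of every item answered 'yes' (missing items count as 'no')."""
    return sum(
        points
        for item, points in HYPOTHETICAL_POINTS.items()
        if answers.get(item, False)
    )


def auroc(labels, scores):
    """AUROC as the fraction of (case, non-case) pairs ranked correctly."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = sum(
        1.0 if c > k else 0.5 if c == k else 0.0
        for c in cases
        for k in controls
    )
    return wins / (len(cases) * len(controls))


# Toy usage with made-up answers and outcomes.
children = [
    ({"parental_asthma": True, "nut_allergy": True}, 1),
    ({"recent_lower_respiratory_infection": True}, 0),
    ({"parental_asthma": True}, 1),
    ({}, 0),
]
scores = [questionnaire_score(answers) for answers, _ in children]
labels = [outcome for _, outcome in children]
print(auroc(labels, scores))
```

The pairwise form is quadratic in cohort size, which is fine for illustration; a rank-based implementation would be preferable for large cohorts.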

Conclusions: We developed predictive models for early asthma whose discrimination in the validation cohort improved as data accumulated, reaching an AUROC of 0.774 with data collected up to age 2 years. The questionnaire-based scoring tool is of particular value because of its clinical applicability. Further validation in diverse populations and investigation of the causative pathways of the identified predictors are necessary to enhance clinical utility.
