Development and Validation of Prediction Models for the Diagnosis of Autism Spectrum Disorder in a Korean General Population
Hyelee Kim MD, MAS, MS; Bennett L. Leventhal MD; Yun-Joo Koh PhD; Efstathios D. Gennatas MBBS, PhD; Young Shin Kim MD, MPH, MS, PhD
JAACAP Open 3(2), pages 302-312. Publication date: 2024-05-27. DOI: 10.1016/j.jaacop.2024.03.005
https://www.sciencedirect.com/science/article/pii/S2949732924000425
Abstract
Objective
Delays in autism spectrum disorder (ASD) diagnosis and treatment are significant clinical problems that can be addressed by timely, community-based assessment. This study examined tools for identifying ASD in community settings using machine learning (ML) models.
Method
This study analyzed data from population-based cross-sectional studies (2005-2017) of ASD in South Korea. A community sample of 62,083 children was screened using the Autism Spectrum Screening Questionnaire (ASSQ) and teacher/caregiver referrals. Caregivers completed the Behavior Assessment System for Children–2nd Edition (BASC-2) and the Social Responsiveness Scale (SRS). Children who screened positive were offered a comprehensive clinical evaluation. Among first-graders in regular elementary schools who completed the diagnostic evaluation (N = 746), supervised ML models were developed and validated for classification of ASD: generalized linear model with elastic net regularization (GLMNET), classification and regression tree, random forest, and gradient boosting (GB). Models were developed on single-questionnaire and combined-questionnaire datasets, using questionnaire responses together with demographic and developmental information.
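For readers unfamiliar with GLMNET, the objective below is the standard elastic net penalized logistic regression as formulated in the glmnet software. It is given only as background: the notation ($x_i$ for a child's predictor vector, $y_i \in \{0,1\}$ for the ASD diagnosis, $\lambda$ for the penalty strength, $\alpha$ for the lasso/ridge mixing weight) is an assumption of this note, and the abstract does not report the tuning values or the exact implementation used.

```latex
\min_{\beta_0,\,\beta}\;
-\frac{1}{N}\sum_{i=1}^{N}
\Bigl[\, y_i\,(\beta_0 + x_i^{\top}\beta)
      - \log\bigl(1 + e^{\beta_0 + x_i^{\top}\beta}\bigr) \Bigr]
\;+\;
\lambda\Bigl[\tfrac{1-\alpha}{2}\,\lVert\beta\rVert_2^{2}
      + \alpha\,\lVert\beta\rVert_1\Bigr]
```

Setting $\alpha = 1$ gives the lasso and $\alpha = 0$ gives ridge regression; intermediate values shrink coefficients while still allowing some to be set exactly to zero, which is useful when many questionnaire items are correlated.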
Results
ASD was diagnosed in 46.2% of children (median age, 6.8 years [interquartile range, 6.5-7.1 years]; 71.7% boys). Among single-questionnaire models, the BASC-2 GB model showed the best discrimination (area under the curve [AUC] 0.80, 95% CI 0.75-0.83). The GLMNET model combining the ASSQ, BASC-2, and SRS had the highest AUC, 0.82 (95% CI 0.77-0.89). The GB model of the combined questionnaires was the best calibrated: its predicted risk of ASD agreed most closely with the observed risk.
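The two kinds of result reported above, discrimination (area under the curve with a 95% CI) and agreement between predicted and observed risk, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic, the function names `auc`, `bootstrap_auc_ci`, and `calibration_table` are hypothetical, and the percentile bootstrap is one common way to obtain an interval, not necessarily the method the authors used.

```python
import random

def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney rank statistic.

    Equals the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as 0.5).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=300, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for the AUC (an assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo = stats[int((1 - level) / 2 * n_boot)]
    hi = stats[min(n_boot - 1, int((1 + level) / 2 * n_boot))]
    return lo, hi

def calibration_table(labels, scores, n_bins=5):
    """Mean predicted risk vs. observed event rate in risk-ordered bins."""
    pairs = sorted(zip(scores, labels))
    size = len(pairs) // n_bins
    rows = []
    for b in range(n_bins):
        # The last bin absorbs any remainder so no case is dropped.
        chunk = pairs[b * size:] if b == n_bins - 1 else pairs[b * size:(b + 1) * size]
        predicted = sum(s for s, _ in chunk) / len(chunk)
        observed = sum(y for _, y in chunk) / len(chunk)
        rows.append((predicted, observed))
    return rows

# Synthetic data only: 200 hypothetical children with a made-up risk score.
rng = random.Random(42)
scores = [rng.random() for _ in range(200)]
labels = [1 if rng.random() < s else 0 for s in scores]  # calibrated by construction

point = auc(labels, scores)
low, high = bootstrap_auc_ci(labels, scores)
print(f"AUC {point:.2f} (95% CI {low:.2f}-{high:.2f})")
for predicted, observed in calibration_table(labels, scores):
    print(f"predicted {predicted:.2f}  observed {observed:.2f}")
```

A model can rank children well (high AUC) and still assign risks that are systematically too high or too low, which is why the two properties are reported separately and why the best-discriminating model and the best-calibrated model need not be the same.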
Conclusion
ML models based on caregiver questionnaires showed promise for identifying children with ASD in community settings.
Plain language summary
To tackle the problem of delayed autism diagnosis, a study in South Korea used machine learning tools to identify autism spectrum disorder (ASD) in a community sample of over 62,000 children. By analyzing questionnaire responses along with demographic and developmental data, researchers developed models to classify ASD; the best model achieved an area under the curve (AUC) of 0.82. The findings suggest that machine learning models based on caregiver questionnaires have significant potential for early identification of ASD in community settings. This could lead to more timely interventions for affected children.