Feature Contributions and Predictive Accuracy in Modeling Adolescent Daytime Sleepiness Using Machine Learning: The MeLiSA Study.

IF 2.7; Zone 3 (Medicine); JCR Q3 (Neurosciences)
Mohammed A Mamun, Jannatul Mawa Misti, Md Emran Hasan, Firoj Al-Mamun, Moneerah Mohammad ALmerab, Johurul Islam, Mohammad Muhit, David Gozal
Brain Sciences, vol. 14, no. 10. Published 2024-10-12. Journal Article.
DOI: 10.3390/brainsci14101015 (https://doi.org/10.3390/brainsci14101015)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11506069/pdf/

Abstract

Background: Excessive daytime sleepiness (EDS) among adolescents poses significant risks to academic performance, mental health, and overall well-being. This study examines the prevalence and risk factors of EDS among adolescents in Bangladesh and uses machine learning approaches to predict EDS risk.

Methods: A cross-sectional study was conducted among 1496 adolescents using a structured questionnaire. Data were collected through a two-stage stratified cluster sampling method. Chi-square tests and logistic regression analyses were performed using SPSS. Machine learning models, including Categorical Boosting (CatBoost), Extreme Gradient Boosting (XGBoost), Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbors (KNN), and Gradient Boosting Machine (GBM), were employed to identify and predict EDS risk factors using Python and Google Colab.

Results: The prevalence of EDS in the cohort was 11.6%. SHAP values from the CatBoost model identified self-rated health status, gender, and depression as the most significant predictors of EDS. Among the models, GBM achieved the highest accuracy (90.15%) and precision (88.81%), while CatBoost had comparable accuracy (89.48%) and the lowest log loss (0.25). ROC-AUC analysis showed that CatBoost and GBM performed robustly in distinguishing between EDS and non-EDS cases, with AUC scores of 0.86. Both models demonstrated superior predictive performance for EDS compared with the other models.

Conclusions: The study emphasizes the role of health and demographic factors in predicting EDS among adolescents in Bangladesh. Machine learning techniques offer valuable insights into the relative contribution of these factors and can guide targeted interventions. Future research should include longitudinal and interventional studies in diverse settings to improve generalizability and develop effective strategies for managing EDS among adolescents.
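
The four evaluation metrics named in the Results (accuracy, precision, log loss, and ROC-AUC) can be illustrated with a minimal pure-Python sketch. This is not the authors' code: the labels and probabilities below are invented toy values, the 0.5 decision threshold is an assumption, and the function names are hypothetical. The last helper shows the accuracy of always predicting "non-EDS", which is a useful reference point for an outcome with 11.6% prevalence, assuming the evaluation data mirror that prevalence (the abstract does not say).

```python
import math

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision(y_true, y_pred):
    """Of the cases predicted positive (EDS), the fraction that truly are."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of the true labels; lower is better."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(y_true)

def roc_auc(y_true, y_prob):
    """Probability that a random positive is scored above a random negative
    (ties count as half). Requires at least one case of each class."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def majority_baseline_accuracy(prevalence):
    """Accuracy of always predicting the majority (negative) class."""
    return 1.0 - prevalence

# Toy data (hypothetical): 1 = EDS, 0 = non-EDS.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.4, 0.6, 0.2, 0.1, 0.3, 0.05, 0.1, 0.2, 0.15]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # assumed 0.5 threshold

print("accuracy :", accuracy(y_true, y_pred))
print("precision:", precision(y_true, y_pred))
print("log loss :", round(log_loss(y_true, y_prob), 4))
print("ROC-AUC  :", roc_auc(y_true, y_prob))
print("baseline accuracy at 11.6% prevalence:",
      round(majority_baseline_accuracy(0.116), 3))
```

Accuracy and precision depend on the chosen threshold, while log loss and ROC-AUC are computed from the predicted probabilities directly, which is one reason a model can rank first on one metric and not on another.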

Source journal: Brain Sciences (Neuroscience: General Neuroscience)
CiteScore: 4.80
Self-citation rate: 9.10%
Articles published: 1472
Review time: 18.71 days
About the journal: Brain Sciences (ISSN 2076-3425) is a peer-reviewed scientific journal that publishes original articles, critical reviews, research notes and short communications in the areas of cognitive neuroscience, developmental neuroscience, molecular and cellular neuroscience, neural engineering, neuroimaging, neurolinguistics, neuropathy, systems neuroscience, and theoretical and computational neuroscience. Our aim is to encourage scientists to publish their experimental and theoretical results in as much detail as possible. There is no restriction on the length of the papers. The full experimental details must be provided so that the results can be reproduced. Electronic files or software regarding the full details of the calculation and experimental procedure, if unable to be published in a normal way, can be deposited as supplementary material.