Mohammed A Mamun, Jannatul Mawa Misti, Md Emran Hasan, Firoj Al-Mamun, Moneerah Mohammad ALmerab, Johurul Islam, Mohammad Muhit, David Gozal
Journal: Brain Sciences, volume 14, issue 10 (Journal Article). Published: 2024-10-12. DOI: https://doi.org/10.3390/brainsci14101015. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11506069/pdf/. Impact factor: 2.7. JCR: Q3 (Neurosciences).
Feature Contributions and Predictive Accuracy in Modeling Adolescent Daytime Sleepiness Using Machine Learning: The MeLiSA Study.
Background: Excessive daytime sleepiness (EDS) among adolescents poses significant risks to academic performance, mental health, and overall well-being. This study examines the prevalence and risk factors of EDS among adolescents in Bangladesh and uses machine learning approaches to predict the risk of EDS.

Methods: A cross-sectional study was conducted among 1496 adolescents using a structured questionnaire. Data were collected through a two-stage stratified cluster sampling method. Chi-square tests and logistic regression analyses were performed using SPSS. Machine learning models, including Categorical Boosting (CatBoost), Extreme Gradient Boosting (XGBoost), Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbors (KNN), and Gradient Boosting Machine (GBM), were employed to identify and predict EDS risk factors using Python and Google Colab.

Results: The prevalence of EDS in the cohort was 11.6%. SHAP values from the CatBoost model identified self-rated health status, gender, and depression as the most significant predictors of EDS. Among the models, GBM achieved the highest accuracy (90.15%) and precision (88.81%), while CatBoost had comparable accuracy (89.48%) and the lowest log loss (0.25). ROC-AUC analysis showed that CatBoost and GBM performed robustly in distinguishing between EDS and non-EDS cases, each with an AUC score of 0.86. Both models demonstrated superior predictive performance for EDS compared with the other models.

Conclusions: The study emphasizes the role of health and demographic factors in predicting EDS among adolescents in Bangladesh. Machine learning techniques offer valuable insights into the relative contribution of these factors and can guide targeted interventions. Future research should include longitudinal and interventional studies in diverse settings to improve generalizability and develop effective strategies for managing EDS among adolescents.
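The four metrics reported in the abstract (accuracy, precision, log loss, ROC-AUC) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the labels and probabilities below are invented toy values, the function names are hypothetical, and the abstract does not say how precision was averaged across classes, so the binary (positive-class) definition is assumed here.

```python
import math

def accuracy(y_true, y_pred):
    """Fraction of cases whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision(y_true, y_pred):
    """Of the cases predicted positive (EDS), the fraction that truly are."""
    predicted_pos = sum(p == 1 for p in y_pred)
    if predicted_pos == 0:
        return 0.0
    true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    return true_pos / predicted_pos

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of the true labels; lower is better."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1 - p)
    return total / len(y_true)

def roc_auc(y_true, y_prob):
    """Probability that a random positive is scored above a random negative
    (ties count as one half)."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

# Toy data (invented): 1 = EDS, 0 = non-EDS, with the positive class rare.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_prob = [0.05, 0.10, 0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.55, 0.90]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # 0.5 threshold is an assumption

acc = accuracy(y_true, y_pred)
prec = precision(y_true, y_pred)
ll = log_loss(y_true, y_prob)
auc = roc_auc(y_true, y_prob)

# Baseline for comparison: always predict the majority class (non-EDS).
baseline_acc = accuracy(y_true, [0] * len(y_true))

print(f"accuracy={acc:.3f} precision={prec:.3f} log_loss={ll:.3f} auc={auc:.4f}")
print(f"majority-class baseline accuracy={baseline_acc:.3f}")
```

The baseline line is the reason to read accuracy alongside ROC-AUC and log loss. If the evaluation data had the same class balance as the cohort (the abstract does not state this), a model that always predicts non-EDS would already reach 1 - 0.116 = 0.884 accuracy, so accuracy alone says little about how well the rare EDS class is detected.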
About the journal:
Brain Sciences (ISSN 2076-3425) is a peer-reviewed scientific journal that publishes original articles, critical reviews, research notes and short communications in the areas of cognitive neuroscience, developmental neuroscience, molecular and cellular neuroscience, neural engineering, neuroimaging, neurolinguistics, neuropathy, systems neuroscience, and theoretical and computational neuroscience. Our aim is to encourage scientists to publish their experimental and theoretical results in as much detail as possible. There is no restriction on the length of the papers. The full experimental details must be provided so that the results can be reproduced. Electronic files or software regarding the full details of the calculation and experimental procedure, if unable to be published in a normal way, can be deposited as supplementary material.