Firoj Al-Mamun, Mohammed A Mamun, Md Emran Hasan, Moneerah Mohammad ALmerab, Johurul Islam, Mohammad Muhit
Association of risk factors with mental illness in a rural community: insights from machine learning models
BJPsych Open, vol. 11, no. 3, e96. Published 2025-05-12. Journal Article.
DOI: https://doi.org/10.1192/bjo.2025.47
Journal metrics: impact factor 3.9; JCR Q1 (Psychiatry).
Citations: 0
Abstract
Background: Mental health conditions, particularly depression and anxiety, are highly prevalent and impose substantial health burdens globally. Despite advancements in machine learning, there is limited application of these methods in predicting common mental illnesses within community populations in low-resource settings.
Aims: This study aims to examine the prevalence and associated risk factors of common mental illnesses collectively (depression and anxiety) in a rural Bangladeshi community using machine learning models.
Method: This cross-sectional study surveyed 490 adults aged 18-59 in a rural Bangladeshi community. Depression and anxiety were assessed using the two-item Patient Health Questionnaire (PHQ-2) and the two-item Generalised Anxiety Disorder scale (GAD-2). Machine learning models, including Categorical Boosting, the support vector machine, the random forest and XGBoost (eXtreme Gradient Boosting), were trained on 80% of the data-set and tested on the remaining 20%. Performance was evaluated using accuracy, precision, F1 score, log-loss and area under the receiver operating characteristic curve (AUC-ROC).
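The 80/20 split and the five evaluation metrics named in the Method can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the random seed and the toy labels are all hypothetical, and precision and F1 are computed here for the positive class only, whereas the study may have reported weighted or macro averages.

```python
import math
import random


def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of rows and hold out test_fraction of them for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(y_true, y_pred):
    """Share of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision(y_true, y_pred):
    """True positives divided by all predicted positives."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    return tp / (tp + fp) if tp + fp else 0.0


def recall(y_true, y_pred):
    """True positives divided by all actual positives."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return tp / (tp + fn) if tp + fn else 0.0


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall."""
    p, r = precision(y_true, y_pred), recall(y_true, y_pred)
    return 2 * p * r / (p + r) if p + r else 0.0


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of the true labels (lower is better)."""
    total = 0.0
    for t, q in zip(y_true, y_prob):
        q = min(max(q, eps), 1 - eps)  # clip to avoid log(0)
        total -= t * math.log(q) + (1 - t) * math.log(1 - q)
    return total / len(y_true)


def auc_roc(y_true, y_prob):
    """Chance that a random positive is scored above a random negative.

    Ties count as half a win. Requires at least one case of each class.
    """
    pos = [q for t, q in zip(y_true, y_prob) if t == 1]
    neg = [q for t, q in zip(y_true, y_prob) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Placeholder participant IDs standing in for the 490 survey records.
train_ids, test_ids = train_test_split(list(range(490)))
print(len(train_ids), len(test_ids))  # 392 98

# Toy labels and predicted probabilities (hypothetical, not study data).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.3, 0.7, 0.8, 0.1]
y_pred = [int(q >= 0.5) for q in y_prob]
print(accuracy(y_true, y_pred), f1_score(y_true, y_pred), auc_roc(y_true, y_prob))
```

Accuracy, precision and F1 depend on the 0.5 decision threshold assumed here, while log-loss and AUC-ROC are computed from the predicted probabilities directly. This is one reason a model can rank first on one metric and not on another.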
Results: Some 20.4% of participants experienced at least one common mental illness. Feature importance analysis identified house type, age group and educational status as the most significant predictors. SHAP (Shapley Additive exPlanations) values highlighted their influence on model outputs, and the XGBoost gain metric confirmed the importance of marital status and house type, with gains of 0.76 and 0.73, respectively. XGBoost delivered the best performance, achieving an F1 score of 71.01%, precision of 71.58%, accuracy of 71.15% and the lowest log-loss value of 0.56. The random forest had an accuracy of 78.21% and an AUC-ROC of 0.90.
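To make the SHAP step more concrete, the sketch below computes exact Shapley values for a single prediction by enumerating every feature coalition. It illustrates the underlying definition only: the SHAP library uses faster model-specific algorithms, and the toy risk function, its coefficients and the binary feature names are hypothetical assumptions rather than values from the study.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, instance, baseline):
    """Exact Shapley values for one instance.

    Features outside a coalition are replaced by their baseline value,
    which is one common (assumed) way to define a "missing" feature.
    """
    names = list(instance)
    n = len(names)

    def value(coalition):
        x = {k: (instance[k] if k in coalition else baseline[k]) for k in names}
        return predict(x)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


def toy_risk(x):
    """Hypothetical risk score with one interaction term (not a fitted model)."""
    return (
        0.10
        + 0.20 * x["poor_housing"]
        + 0.10 * x["older_age_group"]
        + 0.05 * x["no_formal_education"]
        + 0.05 * x["poor_housing"] * x["older_age_group"]
    )


instance = {"poor_housing": 1, "older_age_group": 1, "no_formal_education": 1}
baseline = {"poor_housing": 0, "older_age_group": 0, "no_formal_education": 0}

phi = shapley_values(toy_risk, instance, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:.3f}")
```

In this toy case the contributions are approximately 0.225, 0.125 and 0.050, which sum to the difference between the prediction for the instance (0.50) and for the baseline (0.10). The interaction term is shared equally between the two features involved. Exhaustive enumeration grows exponentially with the number of features, so it is practical only for small illustrations like this one.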
Conclusions: The findings of this study suggest that targeted interventions addressing housing and social determinants could improve mental health outcomes in similar rural settings. Because the design is cross-sectional, further studies should use longitudinal data to explore causal relationships.
About the journal:
Announcing the launch of BJPsych Open, a new open access online journal for the publication of methodologically sound research in all fields of psychiatry and disciplines related to mental health. BJPsych Open will maintain the scientific, peer review and ethical standards of the BJPsych, and will ensure rapid publication for authors while sharing research at no cost to the reader, in the spirit of maximising dissemination and public engagement. Cascade submission from BJPsych to BJPsych Open is a new option for authors whose first priority is rapid online publication under the BJPsych brand. Authors will also retain copyright to their works under a Creative Commons licence.