Development of an explainable machine learning model for predicting depression in adolescent girls with non-suicidal self-injury: A cross-sectional multicenter study
Authors: Ben Niu, Mengjie Wan, Yongjie Zhou
Journal: Journal of Affective Disorders, volume 379, pages 690-702 (impact factor 4.9; JCR Q1, Clinical Neurology)
Published: 2025-03-15
DOI: 10.1016/j.jad.2025.03.080
URL: https://www.sciencedirect.com/science/article/pii/S0165032725004264
Citations: 0
Abstract
Non-suicidal self-injury (NSSI) in adolescent girls is a critical predictor of subsequent depression and suicide risk, yet current tools lack both accuracy and clinical interpretability. We developed the first explainable machine learning model integrating multicenter psychosocial data to predict depression among Chinese adolescent girls with NSSI, addressing the critical need for culturally tailored risk stratification tools. In this cross-sectional observational study, the model was developed using data from 14 hospitals. We used five categories of data as predictors: individual, family, school, psychosocial, and behavioral and lifestyle factors. We compared seven machine learning models, selected the best-performing one as the final model, and used the Shapley Additive exPlanations (SHAP) method to explain its predictions. Discrimination was assessed using the area under the receiver operating characteristic curve (AUROC) with 95% CIs. In the development dataset (n = 1163), the Random Forest (RF) model outperformed the six other algorithms (AUROC = 0.964 [0.945–0.975]), demonstrating superior discriminative power and robustness, and a simplified model containing only the top 20 features had predictive performance similar to that of the full model. The top ten risk predictors were Borderline personality, Rumination, Perceived stress, Hopelessness, Self-esteem, Sleep quality, Loneliness, Resilience, Parental care, and Problem-focused coping. We developed a three-tiered, color-coded web-based clinical tool to operationalize predictions, enabling real-time risk stratification and personalized interventions. Our study bridges machine learning and clinical interpretability to advance precision mental health interventions for vulnerable adolescent populations.
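As a rough illustration of the evaluation and stratification steps the abstract describes, the sketch below computes an AUROC with a percentile-bootstrap 95% CI and maps a predicted probability to one of three color-coded tiers. It is a minimal sketch, not the authors' method: the abstract does not say how the CIs were obtained or where the tier boundaries lie, so the bootstrap procedure, the cut-offs (0.33 and 0.66), the color names, the function names, and the synthetic scores are all assumptions made for illustration.

```python
import random


def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=300, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUROC (an assumed procedure, not taken from the paper)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contained a single class; AUROC is undefined
        stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def risk_tier(probability, low=0.33, high=0.66):
    """Map a predicted probability to a tier; thresholds and colors are hypothetical."""
    if probability < low:
        return "green"
    if probability < high:
        return "yellow"
    return "red"


# Synthetic stand-in for model outputs: 120 cases with alternating labels.
rng = random.Random(42)
labels = [i % 2 for i in range(120)]
scores = [min(1.0, max(0.0, rng.gauss(0.7 if y else 0.3, 0.15))) for y in labels]

point = auroc(labels, scores)
lo, hi = bootstrap_ci(labels, scores)
print(f"AUROC = {point:.3f} [{lo:.3f}-{hi:.3f}]")
print("Tier for p = 0.81:", risk_tier(0.81))
```

In a real pipeline the scores would come from the fitted classifier's predicted probabilities on held-out data, and the tier boundaries would need to be chosen and validated clinically rather than fixed in advance.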
About the journal:
The Journal of Affective Disorders publishes papers concerned with affective disorders in the widest sense: depression, mania, mood spectrum, emotions and personality, anxiety and stress. It is interdisciplinary and aims to bring together different approaches for a diverse readership. Top quality papers will be accepted dealing with any aspect of affective disorders, including neuroimaging, cognitive neurosciences, genetics, molecular biology, experimental and clinical neurosciences, pharmacology, neuroimmunoendocrinology, intervention and treatment trials.