Johannes Massell, Martin Preisig, Marcel Miché, Marie-Pierre F Strippoli, Giorgio Pistis, Roselind Lieb
Social Psychiatry and Psychiatric Epidemiology, pp. 2387-2400. Published 2025-10-01 (Epub 2025-06-18). Journal Article. DOI: https://doi.org/10.1007/s00127-025-02942-z
Prospective prediction of first onset of major depressive disorder in midlife using machine learning.
Purpose: In this paper we leverage machine learning (ML) models to prospectively predict the first onset of Major Depressive Disorder (MDD), one of the most common and disabling mental health conditions. While such prediction models hold potential for enabling early interventions, few studies have applied ML approaches to this task, and those that have are heterogeneous in nature. Moreover, the clinical utility of these predictive models remains largely unexamined.
Methods: Data stemmed from CoLaus|PsyCoLaus, a population-based cohort study. In total, 1350 participants aged 35-66 years without lifetime MDD at baseline took part in the physical and psychiatric baseline evaluation and at least one psychiatric follow-up evaluation. Models based on logistic regression, elastic net, random forests, and XGBoost were trained using an extensive array of psychosocial, environmental, biological, and genetic predictors. Discriminative performance, calibration, clinical utility, and individual predictor contributions were assessed using nested cross-validation.
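The abstract does not describe how the nested cross-validation was set up, so the following is only a minimal sketch of the general technique, not the authors' pipeline. It assumes a toy one-feature k-nearest-neighbour classifier on synthetic data in place of the study's models and predictors; the fold counts, candidate values of `k`, and all function names are hypothetical. The point it illustrates is that the inner loop chooses the hyperparameter using only the outer training data, while the outer loop scores the tuned model on data it never saw.

```python
import random

def k_folds(indices, n_folds):
    """Split a list of indices into n_folds disjoint, roughly equal folds."""
    return [indices[i::n_folds] for i in range(n_folds)]

def knn_predict(train, x, k):
    """Predict a 0/1 label for x by majority vote among the k nearest training rows."""
    nearest = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def accuracy(train, test, k):
    """Fraction of test rows classified correctly by k-NN fitted on train."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)

def rest_of(folds, held_out):
    """All indices from every fold except the one at position held_out."""
    return [j for f, fold in enumerate(folds) if f != held_out for j in fold]

def nested_cv(data, candidate_ks, n_outer=5, n_inner=3, seed=0):
    """Return one (chosen_k, held_out_accuracy) pair per outer fold."""
    indices = list(range(len(data)))
    random.Random(seed).shuffle(indices)
    outer_folds = k_folds(indices, n_outer)
    results = []
    for i, test_idx in enumerate(outer_folds):
        train_idx = rest_of(outer_folds, i)
        inner_folds = k_folds(train_idx, n_inner)

        def inner_score(k):
            # Mean validation accuracy of k, using outer-training data only.
            scores = []
            for m, val_idx in enumerate(inner_folds):
                fit_rows = [data[j] for j in rest_of(inner_folds, m)]
                val_rows = [data[j] for j in val_idx]
                scores.append(accuracy(fit_rows, val_rows, k))
            return sum(scores) / len(scores)

        best_k = max(candidate_ks, key=inner_score)
        # Refit on the full outer-training set and score on the held-out fold.
        train_rows = [data[j] for j in train_idx]
        test_rows = [data[j] for j in test_idx]
        results.append((best_k, accuracy(train_rows, test_rows, best_k)))
    return results

# Synthetic data (an assumption): about 30% positives, one noisy predictor.
rng = random.Random(42)
data = []
for _ in range(150):
    label = 1 if rng.random() < 0.3 else 0
    data.append((rng.gauss(1.0 if label else 0.0, 1.0), label))

results = nested_cv(data, candidate_ks=[1, 5, 15])
for fold, (best_k, score) in enumerate(results, start=1):
    print(f"outer fold {fold}: chosen k = {best_k}, held-out accuracy = {score:.2f}")
```

Averaging the held-out scores across outer folds gives a performance estimate that is not inflated by the hyperparameter search, which is the usual motivation for nesting the two loops.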
Results: Discriminative performance was comparable between models (areas under the precision-recall curve between 0.36 and 0.38; areas under the receiver operating characteristic curve between 0.65 and 0.68). Decision curve analysis suggested clinical utility of logistic regression, elastic net, and random forests for threshold probabilities between 10% and 40%. Across all models, neuroticism, sex, and age were the most important predictors.
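For readers unfamiliar with decision curve analysis, the sketch below computes the standard net-benefit quantity that such an analysis plots against the threshold probability: true positives per person minus false positives per person weighted by the odds of the threshold. The abstract does not say which implementation the authors used; the function names and the ten-person example data here are hypothetical and are not taken from the study. A model is usually said to show clinical utility at a threshold when its net benefit exceeds both the "treat all" and the "treat none" (net benefit 0) strategies.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of intervening on everyone whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1 - threshold)

def net_benefit_treat_all(y_true, threshold):
    """Net benefit of the reference strategy that intervenes on everyone."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Hypothetical outcomes (1 = onset) and predicted risks for ten people.
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_prob = [0.60, 0.10, 0.30, 0.50, 0.05, 0.20, 0.15, 0.25, 0.35, 0.10]

# Thresholds spanning the 10%-40% range mentioned in the Results.
for threshold in (0.10, 0.20, 0.30, 0.40):
    model = net_benefit(y_true, y_prob, threshold)
    treat_all = net_benefit_treat_all(y_true, threshold)
    print(f"threshold {threshold:.2f}: model {model:.3f}, treat all {treat_all:.3f}, treat none 0.000")
```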
Conclusions: Although the prediction models achieved discriminative performance levels above chance, further refinement is necessary. The addition of biological and genetic predictors did not elevate performance markedly. Additional research seems warranted given the limited number and heterogeneous nature of existing studies, the burden associated with MDD, and the potential to improve overall outcomes for people at risk for MDD.
About the journal:
Social Psychiatry and Psychiatric Epidemiology is intended to provide a medium for the prompt publication of scientific contributions concerned with all aspects of the epidemiology of psychiatric disorders - social, biological and genetic.
In addition, the journal has a particular focus on the effects of social conditions upon behaviour and the relationship between psychiatric disorders and the social environment. Contributions may be of a clinical nature provided they relate to social issues, or they may deal with specialised investigations in the fields of social psychology, sociology, anthropology, epidemiology, health service research, health economics or public mental health. We will publish papers on cross-cultural and trans-cultural themes. We do not publish case studies or small case series. While we will publish studies of reliability and validity of new instruments of interest to our readership, we will not publish articles reporting on the performance of established instruments in translation.
Both original work and review articles may be submitted.