Predicting incident cardio-metabolic disease among persons with and without depressive and anxiety disorders: a machine learning approach
Arja O Rydin, George Aalbers, Wessel A van Eeden, Femke Lamers, Yuri Milaneschi, Brenda W J H Penninx
Social Psychiatry and Psychiatric Epidemiology, published 2025-02-18
DOI: https://doi.org/10.1007/s00127-025-02857-9
Abstract
Purpose: There is a global increase in cardiovascular disease and diabetes (cardio-metabolic diseases: CMD). Suffering from depression or anxiety disorders increases the probability of developing CMD. In this study we tested a wide array of predictors for the onset of CMD with Machine Learning (ML), evaluating whether adding detailed psychiatric or biological variables increases predictive performance.
Methods: We analysed data from the Netherlands Study of Depression and Anxiety, a longitudinal cohort study (N = 2071), using 368 predictors covering 4 domains (demographic, lifestyle & somatic, psychiatric, and biological markers). CMD onset (24% incidence) over a 9-year follow-up was defined using self-reported stroke, heart disease, diabetes with high fasting glucose levels and (antithrombotic, cardiovascular, or diabetes) medication use (ATC codes C01DA, C01-C05A-B, C07-C09A-B, C01DB, B01, A10A-X). Using different ML methods (Logistic regression, Support vector machine, Random forest, and XGBoost) we tested the predictive performance of single domains and domain combinations.
Results: The classifiers performed similarly; therefore, the simplest classifier (Logistic regression) was selected. The Area Under the Receiver Operating Characteristic Curve (AUC-ROC) achieved by single domains ranged from 0.569 to 0.649. The combination of demographics, lifestyle & somatic indicators and psychiatric variables performed best (AUC-ROC = 0.669), but did not significantly outperform demographics alone. Age and hypertension contributed most to prediction; detailed psychiatric variables added relatively little.
Conclusion: In this longitudinal study, ML classifiers were not able to accurately predict 9-year CMD onset in a sample enriched for subjects with psychopathology. Detailed psychiatric/biological information did not substantially increase predictive performance.
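For readers less familiar with the metric reported above, the following is a minimal sketch of how AUC-ROC can be computed and compared across predictor domains. It is not the authors' pipeline: the function name `auc_roc`, the labels, and the predicted risks are all hypothetical toy values chosen for illustration. The sketch uses the rank-based definition, in which AUC-ROC is the probability that a randomly chosen case receives a higher predicted risk than a randomly chosen non-case, with ties counted as one half.

```python
from itertools import product


def auc_roc(labels, scores):
    """AUC-ROC via pairwise comparison of cases and non-cases.

    Returns the fraction of (case, non-case) pairs in which the case
    has the higher score; tied pairs contribute 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy outcome: 1 = developed CMD during follow-up, 0 = did not.
labels = [1, 1, 0, 0, 0, 1, 0, 0]

# Hypothetical predicted risks from two models, each trained on one domain.
scores_by_domain = {
    "demographic": [0.8, 0.4, 0.3, 0.5, 0.2, 0.6, 0.1, 0.4],
    "psychiatric": [0.5, 0.3, 0.4, 0.5, 0.2, 0.4, 0.3, 0.6],
}

aucs = {name: auc_roc(labels, s) for name, s in scores_by_domain.items()}
for name, value in aucs.items():
    print(f"{name}: AUC-ROC = {value:.3f}")
```

An AUC-ROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why values in the 0.57 to 0.67 range are described in the abstract as not accurate. The pairwise loop is quadratic in sample size and is meant only to make the definition explicit; a cohort-sized analysis would normally rely on an established library implementation.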
About the journal:
Social Psychiatry and Psychiatric Epidemiology is intended to provide a medium for the prompt publication of scientific contributions concerned with all aspects of the epidemiology of psychiatric disorders - social, biological and genetic.
In addition, the journal has a particular focus on the effects of social conditions upon behaviour and the relationship between psychiatric disorders and the social environment. Contributions may be of a clinical nature provided they relate to social issues, or they may deal with specialised investigations in the fields of social psychology, sociology, anthropology, epidemiology, health service research, health economics or public mental health. We will publish papers on cross-cultural and trans-cultural themes. We do not publish case studies or small case series. While we will publish studies of reliability and validity of new instruments of interest to our readership, we will not publish articles reporting on the performance of established instruments in translation.
Both original work and review articles may be submitted.