A Clinical Risk Prediction Model for Depressive Disorders Based on Seven Machine Learning Algorithms.


International Journal of General Medicine. Pub Date: 2025-05-08; eCollection Date: 2025-01-01. DOI: 10.2147/IJGM.S524016

Weifeng Jin, Shuzi Chen, Mengxia Wang, Ping Lin

International Journal of General Medicine, vol. 18, pp. 2461-2473. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12069924/pdf/


Abstract

Objective: To develop a clinical risk prediction model for depressive disorders using seven machine learning algorithms based on routine blood test indicators.

Methods: A retrospective study was conducted, involving 284 patients with depressive disorders and 214 healthy controls recruited between January and October 2024. Clinical data, including age, sex, and routine blood test results, were collected. The dataset was randomly divided into a training set (70%; n=348) and a test set (30%; n=150). Univariate logistic regression analysis (p<0.1) was initially performed to identify potential predictors, followed by feature selection using the Boruta and LASSO algorithms. Seven machine learning algorithms were employed to construct predictive models, with their performance evaluated using metrics such as AUC, sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), precision, recall, and F1 score. A multivariable logistic regression model was subsequently used to develop a nomogram, and its discrimination, calibration, and clinical utility were comprehensively assessed.
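The split and the evaluation metrics named above can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' code: the function names (`split_train_test`, `binary_metrics`, `auc`) and the toy labels are assumptions made for the example. It also makes explicit that precision is the same quantity as PPV and recall the same as sensitivity.

```python
import random


def split_train_test(n_total, n_test, seed=0):
    """Randomly partition indices 0..n_total-1 into (train, test) index lists."""
    idx = list(range(n_total))
    random.Random(seed).shuffle(idx)
    return sorted(idx[n_test:]), sorted(idx[:n_test])


def binary_metrics(y_true, y_pred):
    """Confusion-matrix metrics for 0/1 labels (1 = depressive disorder)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)   # identical to recall
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)           # identical to precision
    npv = tn / (tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "ppv": ppv, "npv": npv, "f1": f1}


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# 498 participants (284 patients + 214 controls) split into 348 / 150.
train_idx, test_idx = split_train_test(n_total=498, n_test=150)
print(len(train_idx), len(test_idx))

# Toy labels and predictions, invented purely to exercise the functions.
print(binary_metrics([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1]))
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

The pairwise AUC is quadratic in sample size, which is acceptable for a 150-case test set; a rank-based formula would be preferable for large cohorts.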

Results: Four significant predictors (alkaline phosphatase [AKP], serotonin, phenylalanine [Phe], and arginine [Arg]) were identified through univariate logistic regression combined with Boruta and LASSO feature selection. Among the seven algorithms, the random forest model exhibited the highest AUC, achieving an AUC of 1.000 (95% CI: 1.000-1.000) in the training set and 0.958 (95% CI: 0.931-0.985) in the test set. However, due to concerns about potential overfitting, the multivariable logistic regression model was selected as the final predictive model. A nomogram was constructed based on this model.
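An AUC reported with a 95% CI, such as the test-set value above, can be reproduced in spirit with a resampling sketch. The abstract does not state how the intervals were computed, so the percentile bootstrap used here is an assumption (DeLong's method is another common choice). The scores and the 85/65 class counts are synthetic and hypothetical, chosen only to give a 150-case example.

```python
import random


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUC, resampling within each class
    so every resample contains both positives and negatives."""
    rng = random.Random(seed)
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    stats = []
    for _ in range(n_boot):
        bp = rng.choices(pos, k=len(pos))  # sample with replacement
        bn = rng.choices(neg, k=len(neg))
        stats.append(auc([1] * len(bp) + [0] * len(bn), bp + bn))
    stats.sort()
    lo = stats[int((alpha / 2) * n_boot)]
    hi = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


# Synthetic model scores: hypothetical data, not the study's results.
rng = random.Random(42)
labels = [1] * 85 + [0] * 65
scores = ([rng.gauss(1.5, 1.0) for _ in range(85)]
          + [rng.gauss(0.0, 1.0) for _ in range(65)])

point = auc(labels, scores)
lo, hi = bootstrap_auc_ci(labels, scores)
print(f"AUC {point:.3f} (95% CI: {lo:.3f}-{hi:.3f})")
```

Running the same routine on the training set and comparing the two intervals is one simple way to quantify a train/test gap like the random forest's 1.000 versus 0.958, the gap that motivated the overfitting concern.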

Conclusion: This study successfully developed a clinically interpretable risk prediction model for depressive disorders by integrating machine learning algorithms and routine blood test indicators. The logistic regression model demonstrated robust performance across all metrics and holds potential as a reliable auxiliary tool for the diagnosis of depressive disorders.

