Identifying individuals at risk of post-stroke depression: Development and validation of a predictive model.

IF 1.5 | Zone 4, Medicine | Q2 MEDICINE, GENERAL & INTERNAL
Saeed A Alqahtani
Saudi Medical Journal, vol. 46, no. 5, pp. 497-506. Published 2025-05-01. Journal Article.
DOI: https://doi.org/10.15537/smj.2025.46.5.20250080
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12074046/pdf/
Citations: 0

Abstract

Objectives: To identify the factors associated with post-stroke depression (PSD) and develop a machine learning predictive model using a large dataset, considering sociodemographic, lifestyle, and clinical factors.

Methods: Our 2025 study used data from the 2023 Behavioral Risk Factor Surveillance System, released in September 2024. Data processing was carried out in Python on Google Colab. We carried out descriptive statistics, logistic regression, and feature importance analyses (mutual information and adjusted mutual information). Four machine-learning models were trained and evaluated: random forest, decision tree, gradient boosting, and logistic regression. Model performance was assessed using accuracy, precision, recall, the harmonic mean of precision and recall (F1-score), and the area under the receiver operating characteristic curve (AUC-ROC). The best-performing model was fine-tuned using GridSearchCV with 5-fold cross-validation.
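
The workflow described in the Methods can be sketched with scikit-learn. This is only an illustration under stated assumptions, not the author's code: the data are synthetic (`make_classification` stands in for the BRFSS survey variables), and the hyperparameter grid, the `roc_auc` tuning criterion, the train/test split, and the helper name `evaluate` are all hypothetical. The adjusted mutual information analysis is not shown.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.feature_selection import mutual_info_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# ASSUMPTION: synthetic placeholder data, not the BRFSS survey.
X, y = make_classification(n_samples=600, n_features=8, n_informative=4,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Feature importance: mutual information between each feature and the label.
mi_scores = mutual_info_classif(X_train, y_train, random_state=0)

# The four model families named in the abstract, with default-ish settings.
models = {
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "gradient_boosting": GradientBoostingClassifier(random_state=0),
    "logistic_regression": LogisticRegression(max_iter=1000),
}


def evaluate(model, X_eval, y_eval):
    """Return the five metrics listed in the abstract for a fitted model."""
    pred = model.predict(X_eval)
    proba = model.predict_proba(X_eval)[:, 1]  # probability of the positive class
    return {
        "accuracy": accuracy_score(y_eval, pred),
        "precision": precision_score(y_eval, pred),
        "recall": recall_score(y_eval, pred),
        "f1": f1_score(y_eval, pred),
        "auc_roc": roc_auc_score(y_eval, proba),
    }


# Fit each model on the training split and score it on the held-out split.
results = {name: evaluate(model.fit(X_train, y_train), X_test, y_test)
           for name, model in models.items()}

# Tune the random forest with 5-fold cross-validation.
# ASSUMPTION: this small grid and the scoring choice are illustrative only.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid,
                      cv=5, scoring="roc_auc")
search.fit(X_train, y_train)
tuned_metrics = evaluate(search.best_estimator_, X_test, y_test)

for name, metrics in results.items():
    print(name, {k: round(v, 2) for k, v in metrics.items()})
print("tuned random forest", search.best_params_,
      {k: round(v, 2) for k, v in tuned_metrics.items()})
```

The numbers printed by this sketch describe the synthetic data only and say nothing about the study's results.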

Results: Increasing age, male gender, being married, higher income, and physical activity were associated with lower odds of PSD. Obesity, smoking, diabetes, and high cholesterol were associated with increased odds of PSD. Age and gender were the most informative features for predicting PSD. Random forest demonstrated the best performance for predicting PSD (accuracy=0.73, precision=0.71, recall=0.77, F1-score=0.74, and AUC-ROC=0.81), and its performance improved further with hyperparameter optimization.
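
As a quick consistency check, the F1-score is the harmonic mean of precision and recall, so the reported value can be recomputed from the other two figures. The function name `f1_from` is hypothetical, and the inputs are the rounded values from the abstract.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Rounded random forest figures reported in the abstract.
f1 = f1_from(0.71, 0.77)
print(round(f1, 2))  # 0.74
```

The recomputed value matches the reported F1-score of 0.74.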

Conclusion: Post-stroke depression has a complex etiology involving sociodemographic, lifestyle, and clinical factors, notably age and gender. A random forest model effectively predicts PSD, highlighting the need for comprehensive assessment, early intervention, and management of modifiable risks (obesity, smoking, and inactivity) to improve stroke survivors' outcomes.

Source journal: Saudi Medical Journal (category: Medicine, General & Internal)
CiteScore: 2.30
Self-citation rate: 6.20%
Articles published: 203
Review time: 12 months
About the journal: The Saudi Medical Journal is a monthly peer-reviewed medical journal. It is an open access journal, with content released under a Creative Commons attribution-noncommercial license. The journal publishes original research articles, review articles, systematic reviews, case reports, brief communications, brief reports, clinical notes, clinical images, editorials, book reviews, correspondence, and a Student Corner.