Identifying most important predictors for suicidal thoughts and behaviours among healthcare workers active during the Spain COVID-19 pandemic: a machine-learning approach.

Impact factor: 5.9 · Zone 2 (Medicine) · JCR Q1 (Psychiatry)
Itxaso Alayo, Oriol Pujol, Jordi Alonso, Montse Ferrer, Franco Amigo, Ana Portillo-Van Diest, Enric Aragonès, Andrés Aragon Peña, Ángel Asúnsolo Del Barco, Mireia Campos, Meritxell Espuga, Ana González-Pinto, Josep Maria Haro, Nieves López-Fresneña, Alma D Martínez de Salázar, Juan D Molina, Rafael M Ortí-Lucas, Mara Parellada, José Maria Pelayo-Terán, Maria João Forjaz, Aurora Pérez-Zapata, José Ignacio Pijoan, Nieves Plana, Elena Polentinos-Castro, Maria Teresa Puig, Cristina Rius, Ferran Sanz, Cònsol Serra, Iratxe Urreta-Barallobre, Ronny Bruffaerts, Eduard Vieta, Víctor Pérez-Solá, Philippe Mortier, Gemma Vilagut
Journal: Epidemiology and Psychiatric Sciences, volume 34, e28
Publication date: 2025-05-08
Publication type: Journal Article
DOI: https://doi.org/10.1017/S2045796025000198
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12090031/pdf/

Abstract

Aims: Studies conducted during the COVID-19 pandemic found a high occurrence of suicidal thoughts and behaviours (STBs) among healthcare workers (HCWs). The current study aimed to (1) develop a machine learning-based prediction model for future STBs using data from a large prospective cohort of Spanish HCWs and (2) identify the variables that contribute most to the model's predictive accuracy.

Methods: This is a prospective, multicentre cohort study of Spanish HCWs active during the COVID-19 pandemic. A total of 8,996 HCWs participated in the web-based baseline survey (May-July 2020) and 4,809 in the 4-month follow-up survey. A total of 219 predictor variables were derived from the baseline survey. The outcome variable was any STB at the 4-month follow-up. Variables were selected with an L1-regularized linear Support Vector Classifier (SVC). A random forest model was then developed with 5-fold cross-validation, and two class-balancing techniques were tested: the Synthetic Minority Oversampling Technique (SMOTE) and undersampling of the majority class. The model was evaluated by the area under the receiver operating characteristic curve (AUROC) and the area under the precision-recall curve. SHapley Additive exPlanations (SHAP) values were used to evaluate the overall contribution of each variable to the prediction of future STBs. Results were obtained separately by gender.
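
The modelling steps described in the Methods can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data are synthetic (make_classification stands in for the survey predictors), the hyperparameters (C=0.1, 100 trees) are arbitrary, and nesting the variable selection inside each cross-validation fold is a choice made for this sketch that the abstract does not specify. SMOTE and SHAP are left out because they require the separate imbalanced-learn and shap packages.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.metrics import average_precision_score, roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

# Synthetic stand-in for the survey data (assumption): 2,000 rows,
# 50 predictors, roughly 8% positive outcomes.
X, y = make_classification(
    n_samples=2000, n_features=50, n_informative=8,
    weights=[0.92, 0.08], random_state=0,
)

# Step 1: L1-regularized linear SVC; predictors whose weight is shrunk
# to (near) zero are dropped by SelectFromModel.
selector = SelectFromModel(
    LinearSVC(penalty="l1", dual=False, C=0.1, max_iter=5000, random_state=0)
)

# Step 2: random forest fitted on the surviving predictors.
forest = RandomForestClassifier(n_estimators=100, random_state=0)
pipeline = make_pipeline(StandardScaler(), selector, forest)

# Step 3: out-of-fold predicted probabilities from stratified 5-fold CV.
# The whole pipeline is refitted on each training fold, so held-out rows
# never influence which predictors are kept.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_predict(pipeline, X, y, cv=cv, method="predict_proba")[:, 1]

# Step 4: discrimination metrics. average_precision_score is one common
# estimator of the area under the precision-recall curve.
auroc = roc_auc_score(y, scores)
auprc = average_precision_score(y, scores)
print(f"AUROC = {auroc:.2f}, AUPRC = {auprc:.2f}, prevalence = {y.mean():.3f}")
```

Wrapping selection and classifier in one pipeline keeps the evaluation honest: any resampling step such as SMOTE or undersampling would likewise belong inside the training folds only, never in the held-out fold.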

Results: The prevalence of STBs in HCWs at the 4-month follow-up was 7.9% (women = 7.8%, men = 8.2%). Thirty-four variables were selected by the L1 regularized linear SVC. The best results were obtained without data balancing techniques: AUROC = 0.87 (0.86 for women and 0.87 for men) and area under the precision-recall curve = 0.50 (0.55 for women and 0.45 for men). Based on SHAP values, the most important baseline predictors for any STB at the 4-month follow-up were the presence of passive suicidal ideation, the number of days in the past 30 days with passive or active suicidal ideation, the number of days in the past 30 days with binge eating episodes, the number of panic attacks (women only) and the frequency of intrusive thoughts (men only).
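
To put the two reported metrics in context, the sketch below computes AUROC and average precision (a common estimator of the area under the precision-recall curve) from scratch and applies them to uninformative random scores. It illustrates that the chance level for AUROC is 0.5, whereas the chance level for the precision-recall area is roughly the outcome prevalence, so a value of 0.50 at 7.9% prevalence is far above chance. The cohort counts in the code (380 positives out of 4,809) are an assumption derived from the reported prevalence and follow-up sample size, and the function names are hypothetical.

```python
import random

def auroc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count half)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def average_precision(y_true, scores):
    """Mean of the precision values measured at the rank of each positive case."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

# Assumed cohort shape: about 7.9% of 4,809 follow-up participants.
n_total, n_pos = 4809, 380
labels = [1] * n_pos + [0] * (n_total - n_pos)
prevalence = n_pos / n_total

# Uninformative classifier: scores drawn independently of the labels.
rng = random.Random(0)
random_scores = [rng.random() for _ in labels]

chance_auroc = auroc(labels, random_scores)
chance_ap = average_precision(labels, random_scores)
print(f"prevalence = {prevalence:.3f}")
print(f"random scores: AUROC = {chance_auroc:.2f}, average precision = {chance_ap:.2f}")
```

Because the precision-recall baseline moves with prevalence while the AUROC baseline does not, reporting both gives a fuller picture for a rare outcome than AUROC alone.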

Conclusions: Machine learning-based prediction models for STBs in HCWs during the COVID-19 pandemic trained on web-based survey data present high discrimination and classification capacity. Future clinical implementations of this model could enable the early detection of HCWs at the highest risk for developing adverse mental health outcomes.

Study registration: NCT04556565.

Source journal: Epidemiology and Psychiatric Sciences
CiteScore: 7.80
Self-citation rate: 1.20%
Articles published: 121
Review time: >12 weeks
About the journal: Epidemiology and Psychiatric Sciences is an international, peer-reviewed journal that has been published in Open Access format since 2020. Formerly known as Epidemiologia e Psichiatria Sociale and established in 1992 by Michele Tansella, the journal prioritizes highly relevant and innovative research articles and systematic reviews in the areas of public mental health and policy, mental health services and system research, and epidemiological and social psychiatry.