Enhancing Suicide Risk Prediction Models with Temporal Clinical Note Features.

IF 2.1 | Zone 2 (Medicine) | JCR Q4 (Medical Informatics)
Kevin Krause,Sharon Davis,Zhijun Yin,Katherine Schafer,Trent Rosenbloom,Colin Walsh
DOI: 10.1055/a-2411-5796 (https://doi.org/10.1055/a-2411-5796)
Journal: Applied Clinical Informatics
Published: 2024-09-09
Publication type: Journal Article

Abstract

OBJECTIVE
The objective of this study was to investigate the impact of enhancing a structured-data-based suicide attempt risk prediction model with temporal Concept Unique Identifiers (CUIs) derived from clinical notes. We aimed to examine how different temporal schemes, model types, and prediction ranges influenced the model's predictive performance. This research sought to improve our understanding of how the integration of temporal information and clinical variable transformation could enhance model predictions.

MATERIALS AND METHODS
We identified modeling targets using diagnostic codes for suicide attempts within 30, 90, or 365 days following a temporally grouped visit cluster. Structured data included medications, diagnoses, procedures, and demographics, while unstructured data consisted of terms extracted with regular expressions from clinical notes. We compared models trained only on structured data (controls) to hybrid models trained on both structured and unstructured data. We used two temporalization schemes for clinical notes: fixed 90-day windows and flexible epochs. We trained and assessed random forests and hybrid LSTM neural networks using AUPRC and AUROC, with additional evaluation of sensitivity and PPV at 95% specificity.

RESULTS
The training set included 2,364,183 visit clusters with 2,009 30-day suicide attempts, and the testing set contained 471,936 visit clusters with 480 suicide attempts. Models trained with temporal CUIs outperformed those trained with only structured data. The window-temporalized LSTM model achieved the highest AUPRC (0.056 ± 0.013) for the 30-day prediction range. Hybrid models generally showed better performance compared to controls across most metrics.

DISCUSSION AND CONCLUSION
This study demonstrated that incorporating EHR-derived clinical note features enhanced suicide attempt risk prediction models, particularly with window-temporalized LSTM models. Our results underscored the critical value of unstructured data in suicidality prediction, aligning with previous findings. Future research should focus on integrating more sophisticated methods to continue improving prediction accuracy, which will enhance the effectiveness of future intervention.
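
The abstract reports sensitivity and PPV at 95% specificity but does not say how the operating threshold was chosen. The following is a minimal sketch of one common way to compute these two metrics, not the authors' implementation. The function name `sens_ppv_at_specificity`, the strict `score > threshold` decision rule, and the toy data are all assumptions made for illustration.

```python
import math
import numpy as np

def sens_ppv_at_specificity(y_true, scores, specificity=0.95):
    """Return (threshold, sensitivity, ppv) at the lowest threshold that
    keeps specificity at or above the requested level.

    Assumed decision rule: predict positive when score > threshold.
    """
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)

    # Sort the scores of true negatives; placing the threshold at the
    # k-th smallest one leaves at least k negatives at or below it.
    neg_sorted = np.sort(scores[~y_true])
    k = math.ceil(specificity * len(neg_sorted))
    threshold = neg_sorted[k - 1]

    pred = scores > threshold
    tp = int(np.sum(pred & y_true))    # true positives
    fp = int(np.sum(pred & ~y_true))   # false positives
    fn = int(np.sum(~pred & y_true))   # false negatives

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return threshold, sensitivity, ppv

# Toy data (hypothetical): 20 negatives scored 0.00..0.19, 4 positives.
labels = np.array([0] * 20 + [1] * 4)
risk = np.concatenate([np.arange(20) / 100, [0.5, 0.185, 0.05, 0.9]])
thr, sens, ppv = sens_ppv_at_specificity(labels, risk, specificity=0.95)
print(thr, sens, ppv)
```

For context on the reported AUPRC: a classifier that ranks at random has an expected AUPRC close to the outcome prevalence. If the 480 test-set events are 30-day events (the abstract does not state the horizon for that count), prevalence is 480 / 471,936, or about 0.1%, so an AUPRC of 0.056 would sit well above that chance level.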
Source journal: Applied Clinical Informatics (Medical Informatics)
CiteScore: 4.60
Self-citation rate: 24.10%
Articles published: 132
About the journal: ACI is the third Schattauer journal dealing with biomedical and health informatics. It complements the publisher's other journals, Methods of Information in Medicine and the Yearbook of Medical Informatics. With the Yearbook of Medical Informatics as the "Milestone" or state-of-the-art journal and Methods of Information in Medicine as the "Science and Research" journal of IMIA, ACI intends to be the "Practical" journal of IMIA.