Applying large language models to stratify suicide risk using narrative clinical notes

Thomas H. McCoy, Roy H. Perlis
Journal of mood and anxiety disorders, volume 10, Article 100109. Published 2025-01-31. DOI: 10.1016/j.xjmad.2025.100109. URL: https://www.sciencedirect.com/science/article/pii/S2950004425000069

Abstract

We investigated whether large language models can stratify risk for suicide following hospital discharge. We drew on a very large cohort of 458,053 adults discharged from two academic medical centers between January 4, 2005 and January 2, 2014, linked to administrative vital status data. From this sample, each of the 1995 individuals who died by suicide or accident was matched with 5 control individuals on the basis of age, sex, race and ethnicity, admitting hospital, insurance, comorbidity index, and discharge year. We applied a HIPAA-compliant large language model (gpt-4-1106-preview) to estimate risk for suicide based on narrative discharge summaries. In the resulting cohort (n = 11,970), median age was 57 (IQR 44–76); 4536 (38%) were women; 348 (3%) had a primary psychiatric admission diagnosis. For the model-predicted risk, time to 90% survival was 1588 days (IQR 1374–1905) in the lowest-risk quartile, 1432 (IQR 1157–1651) in the 2nd quartile, 661 (IQR 538–820) in the 3rd quartile, and 302 (IQR 260–362) in the top quartile (p < .001). In Fine and Gray competing risk regression, predicted hazard was significantly associated with observed risk (unadjusted HR 7.66 [95% CI 6.40–9.27]; adjusted for sociodemographic features and utilization, HR 8.86 [7.00–11.2]). Estimated risk scores were significantly greater among individuals who were Black or Hispanic (p < .005 for each, versus white individuals). Overall, a large language model (LLM) was able to stratify risk for suicide and accidental death among individuals discharged from academic medical centers beyond that afforded by simple sociodemographic and clinical features.
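The cohort arithmetic and the quartile-based summary described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code. The function names are hypothetical, the rank-based quartile split is an assumption, and the time-to-90%-survival calculation uses a plain Kaplan-Meier estimator, which is a simplification: the abstract does not say how that quantity was estimated, and the regression it reports is a Fine and Gray competing risk model, which this sketch does not implement.

```python
from fractions import Fraction
from typing import List, Optional, Sequence


def matched_cohort_size(n_cases: int, controls_per_case: int) -> int:
    """Cohort size when every case is matched to a fixed number of controls."""
    return n_cases * (1 + controls_per_case)


def quartile_labels(scores: Sequence[float]) -> List[int]:
    """Label each score 1..4 by rank (1 = lowest predicted risk).

    Assumption: a simple rank-based split; ties are broken by input order.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    labels = [0] * len(scores)
    for rank, i in enumerate(order):
        labels[i] = min(4, 1 + (4 * rank) // len(scores))
    return labels


def time_to_survival(
    times: Sequence[int],
    events: Sequence[bool],
    threshold: Fraction = Fraction(9, 10),
) -> Optional[int]:
    """First event time at which Kaplan-Meier survival is <= threshold.

    times  : follow-up time per person (for example, days since discharge)
    events : True if the outcome occurred at that time, False if censored
    Returns None if survival never reaches the threshold.
    Exact fractions are used so the comparison has no rounding error.
    """
    survival = Fraction(1)
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if e and u == t)
        survival *= Fraction(at_risk - deaths, at_risk)
        if survival <= threshold:
            return t
    return None


# The matching ratio in the abstract reproduces the reported cohort size.
print(matched_cohort_size(1995, 5))
```

In practice one would call `time_to_survival` once per quartile, on the follow-up times of the people whose `quartile_labels` value matches that quartile, and then compare the four results.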