{"title":"Applying large language models to stratify suicide risk using narrative clinical notes","authors":"Thomas H. McCoy, Roy H. Perlis","doi":"10.1016/j.xjmad.2025.100109","journal":{"name":"Journal of Mood and Anxiety Disorders","volume":"10","pages":"Article 100109"},"publicationDate":"2025-01-31","publicationTypes":"Journal Article","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2950004425000069"}
Applying large language models to stratify suicide risk using narrative clinical notes
We investigated whether large language models can stratify risk for suicide following hospital discharge. We drew on a cohort of 458,053 adults discharged from two academic medical centers between January 4, 2005 and January 2, 2014, linked to administrative vital status data. From this sample, each of the 1,995 individuals who died by suicide or accident was matched with 5 control individuals on the basis of age, sex, race and ethnicity, admitting hospital, insurance, comorbidity index, and discharge year. We applied a HIPAA-compliant large language model (gpt-4-1106-preview) to estimate risk for suicide based on narrative discharge summaries. In the resulting cohort (n = 11,970), median age was 57 (IQR 44–76); 4,536 (38%) were women; 348 (3%) had a primary psychiatric admission diagnosis. For the model-predicted risk, time to 90% survival was 1,588 days (IQR 1,374–1,905) in the lowest-risk quartile, 1,432 days (IQR 1,157–1,651) in the 2nd quartile, 661 days (IQR 538–820) in the 3rd quartile, and 302 days (IQR 260–362) in the top quartile (p < .001). In Fine and Gray competing risk regression, predicted hazard was significantly associated with observed risk (unadjusted HR 7.66 [95% CI 6.40–9.27]; adjusted for sociodemographic features and utilization, HR 8.86 [95% CI 7.00–11.2]). Estimated risk scores were significantly greater among individuals who were Black or Hispanic (p < .005 for each, versus white individuals). Overall, a large language model (LLM) was able to stratify risk for suicide and accidental death among individuals discharged from academic medical centers beyond that afforded by simple sociodemographic and clinical features.
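The cohort arithmetic and the quartile grouping described in the abstract can be illustrated with a short sketch. This is not the authors' pipeline, which the abstract does not describe in detail. The function names and the toy risk scores below are hypothetical, and the choice of inclusive sample quartiles as cut points is an assumption. Only the counts (1,995 cases, 5 controls per case, 4,536 women, 348 psychiatric admissions) come from the abstract.

```python
from statistics import quantiles

CASES = 1995            # deaths by suicide or accident (from the abstract)
CONTROLS_PER_CASE = 5   # matching ratio (from the abstract)


def matched_cohort_size(cases: int, controls_per_case: int) -> int:
    """Total cohort size when every case is matched to a fixed number of controls."""
    return cases * (1 + controls_per_case)


def percent(part: int, whole: int) -> int:
    """Percentage of `whole` represented by `part`, rounded to a whole number."""
    return round(100 * part / whole)


def assign_quartiles(scores: list[float]) -> list[int]:
    """Label each score 1 (lowest risk) to 4 (highest risk).

    Assumption: cut points are the inclusive sample quartiles of the scores.
    """
    q1, q2, q3 = quantiles(scores, n=4, method="inclusive")
    # Each comparison contributes 1 when the score exceeds that cut point.
    return [1 + (s > q1) + (s > q2) + (s > q3) for s in scores]


cohort = matched_cohort_size(CASES, CONTROLS_PER_CASE)
print(cohort)                                        # 11970, the reported n
print(percent(4536, cohort), percent(348, cohort))   # 38 3

# Hypothetical model-predicted risk scores, for illustration only.
toy_scores = [0.05, 0.10, 0.20, 0.30, 0.45, 0.60, 0.80, 0.95]
print(assign_quartiles(toy_scores))
```

The first two printed lines reproduce the reported cohort size and the rounded percentages, which serves as a consistency check on the abstract's figures. The quartile labels are the kind of grouping variable over which the per-quartile time-to-90%-survival figures would then be computed.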