Vijeeth Guggilla, Mengjia Kang, Melissa J Bak, Steven D Tran, Anna Pawlowski, Prasanth Nannapaneni, Luke V Rasmussen, Daniel Schneider, Helen K Donnelly, Ankit Agrawal, David Liebovitz, Alexander V Misharin, G R Scott Budinger, Richard G Wunderink, Theresa L Walunas, Catherine A Gao
Journal of the American Medical Informatics Association. Published 2025-09-22. DOI: 10.1093/jamia/ocaf141 (https://doi.org/10.1093/jamia/ocaf141). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12490808/pdf/
Large language models accurately identify immunosuppression in intensive care unit patients.
Objective: Rule-based structured data algorithms and natural language processing (NLP) approaches applied to unstructured clinical notes have limited accuracy and poor generalizability for identifying immunosuppression. Large language models (LLMs) may effectively identify patients with heterogeneous types of immunosuppression from unstructured clinical notes. We compared the performance of LLMs applied to unstructured notes for identifying patients with immunosuppressive conditions or immunosuppressive medication use against 2 baselines: (1) structured data algorithms using diagnosis codes and medication orders and (2) NLP approaches applied to unstructured notes.
Materials and methods: We used hospital admission notes from a primary cohort of 827 intensive care unit (ICU) patients at Northwestern Memorial Hospital and a validation cohort of 200 ICU patients at Beth Israel Deaconess Medical Center, along with diagnosis codes and medication orders from the primary cohort. We evaluated the performance of structured data algorithms, NLP approaches, and LLMs in identifying 7 immunosuppressive conditions and 6 immunosuppressive medications.
Results: In the primary cohort, structured data algorithms achieved peak F1 scores ranging from 0.30 to 0.97 for identifying immunosuppressive conditions and medications. NLP approaches achieved peak F1 scores ranging from 0 to 1. GPT-4o outperformed or matched structured data algorithms and NLP approaches across all conditions and medications, with F1 scores ranging from 0.51 to 1. GPT-4o also performed impressively in our validation cohort (F1 = 1 for 8/13 variables).
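For readers unfamiliar with the metric, the F1 scores reported above are the harmonic mean of precision and recall, computed per variable (each condition or medication) against reference labels. The following is a minimal illustrative sketch, not the authors' evaluation code: the function name `f1_score`, the 0/1 label encoding, and the toy labels are assumptions made for this example and are not study data.

```python
def f1_score(y_true, y_pred):
    """Return the F1 score for one binary variable.

    y_true: reference labels (1 = variable present, 0 = absent).
    y_pred: labels produced by the method under evaluation.
    Both encodings are assumptions made for this sketch.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    # With no true positives, F1 is 0 by the usual convention
    # (this also avoids dividing by zero).
    if tp == 0:
        return 0.0

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels for a single hypothetical variable (not data from the study).
reference = [1, 1, 1, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0]
print(round(f1_score(reference, predicted), 3))
```

An F1 of 1 therefore means a method produced no false positives and no false negatives for that variable, while an F1 of 0 means it identified no true positives.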
Discussion: LLMs, particularly GPT-4o, outperformed structured data algorithms and NLP approaches in identifying immunosuppressive conditions and medications with robust external validation.
Conclusion: LLMs can be applied for improved cohort identification for research purposes.
About the journal:
JAMIA is AMIA's premier peer-reviewed journal for biomedical and health informatics. Covering the full spectrum of activities in the field, JAMIA includes informatics articles in the areas of clinical care, clinical research, translational science, implementation science, imaging, education, consumer health, public health, and policy. JAMIA's articles describe innovative informatics research and systems that help to advance biomedical science and to promote health. Case reports, perspectives and reviews also help readers stay connected with the most important informatics developments in implementation, policy and education.