Large language models outperform traditional structured data-based approaches in identifying immunosuppressed patients
Vijeeth Guggilla, Mengjia Kang, Melissa J Bak, Steven D Tran, Anna Pawlowski, Prasanth Nannapaneni, Luke V Rasmussen, Daniel Schneider, Helen Donnelly, Ankit Agrawal, David Liebovitz, Alexander V Misharin, Gr Scott Budinger, Richard G Wunderink, Theresa L Walunas, Catherine A Gao
medRxiv: the preprint server for health sciences. Published 2025-01-17. DOI: https://doi.org/10.1101/2025.01.16.25320564
Abstract
Identifying immunosuppressed patients using structured data can be challenging. Large language models effectively extract structured concepts from unstructured clinical text. Here we show that GPT-4o outperforms traditional approaches in identifying immunosuppressive conditions and medication use by processing hospital admission notes. We also demonstrate the extensibility of our approach in an external dataset. Cost-effective models like GPT-4o mini and Llama 3.1 also perform well, but not as well as GPT-4o.