Xinmeng Zhang, Chao Yan, Yuyang Yang, Zhuohang Li, Yubo Feng, Bradley A Malin, You Chen
{"title":"Optimizing Large Language Models for Discharge Prediction: Best Practices in Leveraging Electronic Health Record Audit Logs.","authors":"Xinmeng Zhang, Chao Yan, Yuyang Yang, Zhuohang Li, Yubo Feng, Bradley A Malin, You Chen","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Electronic Health Record (EHR) audit log data are increasingly utilized for clinical tasks, from workflow modeling to predictive analyses of discharge events, adverse kidney outcomes, and hospital readmissions. These data encapsulate user-EHR interactions, reflecting both healthcare professionals' behavior and patients' health statuses. To harness this temporal information effectively, this study explores the application of Large Language Models (LLMs) in leveraging audit log data for clinical prediction tasks, specifically focusing on discharge predictions. Utilizing a year's worth of EHR data from Vanderbilt University Medical Center, we fine-tuned LLMs with 10,000 randomly selected training examples. Our findings reveal that LLaMA-2 70B, with an AUROC of 0.80 [0.77-0.82], outperforms both GPT-4 128K in a zero-shot setting, with an AUROC of 0.68 [0.65-0.71], and DeBERTa, with an AUROC of 0.78 [0.75-0.82]. Among various serialization methods, the first-occurrence approach, wherein only the initial appearance of each event in a sequence is retained, shows superior performance. Furthermore, for the fine-tuned LLaMA-2 70B, logit outputs yield a higher AUROC of 0.80 [0.77-0.82] compared to text outputs, with an AUROC of 0.69 [0.67-0.72]. This study underscores the potential of fine-tuned LLMs, particularly when combined with strategic sequence serialization, in advancing clinical prediction tasks.</p>","PeriodicalId":72180,"journal":{"name":"AMIA ... Annual Symposium proceedings. AMIA Symposium","volume":"2024 ","pages":"1323-1331"},"PeriodicalIF":0.0,"publicationDate":"2025-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099422/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144144658","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Suraj Sood, Jawad S Shah, Saeed Alqarn, Yugyung Lee
{"title":"PathSAM: Enhancing Oral Cancer Detection with Advanced Segmentation and Explainability.","authors":"Suraj Sood, Jawad S Shah, Saeed Alqarn, Yugyung Lee","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Building on the success of the Segment Anything Model (SAM) in image segmentation, \"PathSAM: SAM for Pathological Images in Oral Cancer Detection\" addresses the unique challenges associated with diagnosing oral cancer. Although SAM is versatile, its application to pathological images is hindered by its inherent complexity and variability. PathSAM advances beyond traditional deep-learning methods by delivering superior accuracy and detail in segmenting critical datasets like ORCA and OCDC, as demonstrated through both quantitative and qualitative evaluations. The integration of Large Language Models (LLMs) further enhances PathSAM by providing clear, interpretable segmentation results, facilitating accurate tumor identification, and improving communication between patients and healthcare providers. This innovation positions PathSAM as a valuable tool in medical diagnostics.</p>","PeriodicalId":72180,"journal":{"name":"AMIA ... Annual Symposium proceedings. AMIA Symposium","volume":"2024 ","pages":"1069-1078"},"PeriodicalIF":0.0,"publicationDate":"2025-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099372/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144144664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hanna Kiani, Sohaib Hassan, Julian Z Genkins, Jasmine Bilir, Julia Kadie, Tran Le, Jo-Anne Suffoletto, Jonathan H Chen
{"title":"Improving Emergency Department Visit Risk Prediction: Exploring the Operational Utility of Applied Patient Portal Messages.","authors":"Hanna Kiani, Sohaib Hassan, Julian Z Genkins, Jasmine Bilir, Julia Kadie, Tran Le, Jo-Anne Suffoletto, Jonathan H Chen","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Patient portal messages represent a unique source of clinical data due to how they represent the voice of the patient, provide a glimpse into care delivery between episodic synchronous appointments, and capture variations in patient behavior and health literacy. There is little understanding of how to best apply modern natural language processing (NLP) approaches, such as large, pre-trained language models (LLMs), to patient messages. In this study, we aim to explore different approaches to incorporating patient messages into an existing Emergency Department (ED) visit risk prediction model currently deployed at Stanford Health Care. With the addition of patient message frequencies to the baseline, we were able to achieve an improved AUC of 0.77 and a jump in the F1 score. In future work, we aim to build upon these findings and further test combination models to incorporate features around patient message content, in addition to message frequencies.</p>","PeriodicalId":72180,"journal":{"name":"AMIA ... Annual Symposium proceedings. AMIA Symposium","volume":"2024 ","pages":"610-619"},"PeriodicalIF":0.0,"publicationDate":"2025-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099376/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144144673","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Novel Sentence Transformer-based Natural Language Processing Approach for Schema Mapping of Electronic Health Records to the OMOP Common Data Model.","authors":"Xinyu Zhou, Lovedeep Singh Dhingra, Arya Aminorroaya, Philip Adejumo, Rohan Khera","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Mapping electronic health records (EHR) data to common data models (CDMs) enables the standardization of clinical records, enhancing interoperability and enabling large-scale, multi-centered clinical investigations. Using 2 large publicly available datasets, we developed transformer-based natural language processing models to map medication-related concepts from the EHR at a large and diverse healthcare system to standard concepts in OMOP CDM. We validated the model outputs against standard concepts manually mapped by clinicians. Our best model reached out-of-box accuracies of 96.5% in mapping the 200 most common drugs and 83.0% in mapping 200 random drugs in the EHR. For these tasks, this model outperformed a state-of-the-art large language model (SFR-Embedding-Mistral, 89.5% and 66.5% in accuracy for the two tasks), a widely used software for schema mapping (Usagi, 90.0% and 70.0% in accuracy), and direct string match (7.5% and 7.5% accuracy). Transformer-based deep learning models outperform existing approaches in the standardized mapping of EHR elements and can facilitate an end-to-end automated EHR transformation pipeline.</p>","PeriodicalId":72180,"journal":{"name":"AMIA ... Annual Symposium proceedings. AMIA Symposium","volume":"2024 ","pages":"1332-1339"},"PeriodicalIF":0.0,"publicationDate":"2025-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099400/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144144689","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Markus Kreuzthaler, Bastian Pfeifer, Stefan Schulz
{"title":"Secondary Use of Clinical Problem List Descriptions for Bi-Encoder Based ICD-10 Classification.","authors":"Markus Kreuzthaler, Bastian Pfeifer, Stefan Schulz","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Annotated language resources are essential for supervised machine learning methods. In the clinical domain, such data sets can boost use-case specific natural language processing services. In this work, we have analyzed a clinical problem list table consisting of millions of ICD-10 codes assigned to short problem list descriptions in German. We have investigated whether the given data forms a valuable resource within a secondary use case scenario for coding support. Our proposed methodology exploits an embedding-based k-NN classifier, which was evaluated based on its coding performance, leveraging the multilingual BERT based language model SapBERT-UMLS in comparison with medBERT.de, which is specifically tailored to medical and clinical language resources in German. Our approach reached a weighted F1-measure of 0.87 using SapBERT-UMLS and an F1-measure of 0.86 for medBERT.de. The approach revealed promising coding results when reusing annotated language resources out of clinical routine documentation.</p>","PeriodicalId":72180,"journal":{"name":"AMIA ... Annual Symposium proceedings. AMIA Symposium","volume":"2024 ","pages":"620-627"},"PeriodicalIF":0.0,"publicationDate":"2025-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099355/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144144786","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Global Relevance of Online Health Information Sources: A Case Study of Experiences and Perceptions of Nigerians.","authors":"Ommo Clark, Karuna P Joshi, Tera L Reynolds","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Online health information sources (OHIS) offer potential for improving access to health information especially in areas with limited healthcare infrastructure. However, OHIS predominantly originates from Western societies potentially ignoring the specific needs and cultural contexts of diverse populations. There is limited research on the global suitability of OHIS content. This study explores the global relevance of OHIS for diverse populations through a case study examining user experiences of Nigerians living in multiple countries. Findings reveal OHIS usage patterns are influenced by the country of residence and local health services availability. The study highlights the need for culturally inclusive OHIS content to ensure equitable health information access globally. Ultimately, for OHIS to serve a global audience effectively, there needs to be reliable information sources that acknowledge and cater to different users' cultural backgrounds, including prevalent health issues, medical practices, beliefs, languages, and healthcare expectations.</p>","PeriodicalId":72180,"journal":{"name":"AMIA ... Annual Symposium proceedings. AMIA Symposium","volume":"2024 ","pages":"300-308"},"PeriodicalIF":0.0,"publicationDate":"2025-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099433/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144144648","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Assessing the Seasonality of Lab Tests Among Patients with Alzheimer's Disease and Related Dementias in OneFlorida Data Trust.","authors":"Wenshan Han, Balu Bhasuran, Victorine Patricia Muse, Søren Brunak, Lifeng Lin, Karim Hanna, Yu Huang, Jiang Bian, Zhe He","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>About 1 in 9 older adults over 65 has Alzheimer's disease (AD), many of whom also have multiple other chronic conditions such as hypertension and diabetes, necessitating careful monitoring through laboratory tests. Understanding the patterns of laboratory tests in this population aids our understanding and management of these chronic conditions along with AD. In this study, we used a unimodal cosinor model to assess the seasonality of lab tests using electronic health record (EHR) data from 34,303 AD patients from the OneFlorida+ Clinical Research Consortium. We observed significant seasonal fluctuations (higher in winter) in lab tests such as glucose, neutrophils per 100 white blood cells (WBC), and WBC. Notably, certain leukocyte types like eosinophils, lymphocytes, and monocytes are elevated during summer, likely reflecting seasonal respiratory diseases and allergens. Seasonality is more pronounced in older patients and varies by gender. Our findings suggest that recognizing these patterns and adjusting reference intervals for seasonality would allow healthcare providers to enhance diagnostic precision, tailor care, and potentially improve patient outcomes.</p>","PeriodicalId":72180,"journal":{"name":"AMIA ... Annual Symposium proceedings. AMIA Symposium","volume":"2024 ","pages":"483-492"},"PeriodicalIF":0.0,"publicationDate":"2025-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099435/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144144699","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Feng Chen, Manas Satish Bedmutha, Ray-Yuan Chung, Janice Sabin, Wanda Pratt, Brian R Wood, Nadir Weibel, Andrea L Hartzler, Trevor Cohen
{"title":"Toward Automated Detection of Biased Social Signals from the Content of Clinical Conversations.","authors":"Feng Chen, Manas Satish Bedmutha, Ray-Yuan Chung, Janice Sabin, Wanda Pratt, Brian R Wood, Nadir Weibel, Andrea L Hartzler, Trevor Cohen","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Implicit bias can impede patient-provider interactions and lead to inequities in care. Raising awareness is key to reducing such bias, but its manifestations in the social dynamics of patient-provider communication are difficult to detect. In this study, we used automated speech recognition (ASR) and natural language processing (NLP) to identify social signals in patient-provider interactions. We built an automated pipeline to predict social signals from audio recordings of 782 primary care visits that achieved 90.1% average accuracy across codes, and exhibited fairness in its predictions for white and non-white patients. Applying this pipeline, we identified statistically significant differences in provider communication behavior toward white versus non-white patients. In particular, providers expressed more patient-centered behaviors towards white patients including more warmth, engagement, and attentiveness. Our study underscores the potential of automated tools in identifying subtle communication signals that may be linked with bias and impact healthcare quality and equity.</p>","PeriodicalId":72180,"journal":{"name":"AMIA ... Annual Symposium proceedings. AMIA Symposium","volume":"2024 ","pages":"252-261"},"PeriodicalIF":0.0,"publicationDate":"2025-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099337/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144144811","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Xiaojin Li, Yan Huang, Licong Cui, Shiqiang Tao, Guo-Qiang Zhang
{"title":"Optimizing Medication Querying Using Ontology-Driven Approach with OMOP: with an application to a large-scale COVID-19 EHR dataset.","authors":"Xiaojin Li, Yan Huang, Licong Cui, Shiqiang Tao, Guo-Qiang Zhang","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Efficient querying for medication information in Electronic Health Record (EHR) datasets is crucial for effective patient care and clinical research. To address the complexity and data volume challenges involved in efficient medication information retrieval, we propose an ontology-driven medication query (ODMQ) optimization approach, leveraging the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM). Integrating semantic ontology structures from the OMOP CDM can help enhance query accuracy and efficiency by broadening the scope of relevant medication terms like drug names, National Drug Codes, and generics, resulting in more comprehensive query outcomes than traditional methods. ODMQ significantly reduces manual search time and enhances query capabilities. We validate ODMQ's efficacy using real-world COVID-19 EHR data, demonstrating improved query performance. Through a comprehensive manual review, ODMQ ensures that expanded search terms are relevant to user inputs. It also includes an intuitive query interface and visualizes patient history for result validation and exploration.</p>","PeriodicalId":72180,"journal":{"name":"AMIA ... Annual Symposium proceedings. AMIA Symposium","volume":"2024 ","pages":"693-702"},"PeriodicalIF":0.0,"publicationDate":"2025-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099415/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144144660","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Gary E Weissman, Rebecca A Hubbard, Blanca E Himes, Kelly L Goodman-O'Leary, Michael O Harhay, Jennifer C Ginestra, Rachel Kohn, Andrew J Admon, Stephanie Parks Taylor, Scott D Halpern
{"title":"Sepsis Prediction Models are Trained on Labels that Diverge from Clinician-Recommended Treatment Times.","authors":"Gary E Weissman, Rebecca A Hubbard, Blanca E Himes, Kelly L Goodman-O'Leary, Michael O Harhay, Jennifer C Ginestra, Rachel Kohn, Andrew J Admon, Stephanie Parks Taylor, Scott D Halpern","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Many sepsis prediction models use the Sepsis-3 definition or its variants as a training label. However, among the few sepsis models ever deployed in practice, there is scant evidence that they offer clinically meaningful decision support at the bedside. As a potential mechanism to explain this limitation, we hypothesized that clinician-recommended treatment times for sepsis would diverge from onset time defined by Sepsis-3. We conducted an electronic survey that was completed by 153 clinicians at three large and geographically diverse medical centers using vignettes derived from eight real cases of sepsis. After reviewing these vignettes, participants suggested antibiotic treatment to start an average of 7.0 hours (95% confidence interval 5.3 to 8.8) before the Sepsis-3 definition onset. Thus, predicting Sepsis-3 onset as a treatment prompt could lead to inappropriate and delayed treatment recommendations. Building predictive decision support systems that identify outcomes aligned with bedside decisions would increase their clinical utility.</p>","PeriodicalId":72180,"journal":{"name":"AMIA ... Annual Symposium proceedings. AMIA Symposium","volume":"2024 ","pages":"1215-1224"},"PeriodicalIF":0.0,"publicationDate":"2025-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12099352/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144144733","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}