April S Liang, Juan M Banda, Thomas Savage, Abby Pandya, Rebecca Carey, Uchechukwu C Megwalu, Michael T Chang, Dev Dash, Conor K Corbin, Aditya Sharma, Rahul Thapa, Nikesh Kotecha, Nigam H Shah, Jennifer Y Lee, Jonathan H Chen
{"title":"Feasibility of Automated Precharting using GPT-4 in New Specialty Referrals.","authors":"April S Liang, Juan M Banda, Thomas Savage, Abby Pandya, Rebecca Carey, Uchechukwu C Megwalu, Michael T Chang, Dev Dash, Conor K Corbin, Aditya Sharma, Rahul Thapa, Nikesh Kotecha, Nigam H Shah, Jennifer Y Lee, Jonathan H Chen","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>This study evaluates the feasibility of using GPT-4 to automate precharting for specialty referrals, focusing on new patients referred to an otolaryngology clinic for nasal congestion. We describe the design decisions and strategies tested in creating this precharting utility, including methods for prompt design and token limit handling. Through iterative testing and building, our tool achieved 95.0% agreement with physician consensus in a small retrospective test sample. Results from a small prospective pilot showed favorable feedback on the summaries in a real-world clinical setting, though there was a discrepancy between a high intention to use the summary and a lower perception of time savings. Our results demonstrate that automated precharting with accuracy and clinical relevance can be feasible with large language models such as GPT-4. Our design features can inform the development of vendor chart summarization solutions.</p>","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2025 ","pages":"312-321"},"PeriodicalIF":0.0,"publicationDate":"2025-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12150724/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144276846","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
François Grolleau, Robert Tibshirani, Jonathan H Chen
{"title":"powerROC: An Interactive Web Tool for Sample Size Calculation in Assessing Models' Discriminative Abilities.","authors":"François Grolleau, Robert Tibshirani, Jonathan H Chen","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Rigorous external validation is crucial for assessing the generalizability of prediction models, particularly by evaluating their discrimination (AUROC) on new data. This often involves comparing a new model's AUROC to that of an established reference model. However, many studies rely on arbitrary rules of thumb for sample size calculations, often resulting in underpowered analyses and unreliable conclusions. This paper reviews crucial concepts for accurate sample size determination in AUROC-based external validation studies, making the theory and practice more accessible to researchers and clinicians. We introduce powerROC, an open-source web tool designed to simplify these calculations, enabling both the evaluation of a single model and the comparison of two models. The tool offers guidance on selecting target precision levels and employs flexible approaches, leveraging either pilot data or user-defined probability distributions. We illustrate powerROC's utility through a case study on hospital mortality prediction using the MIMIC database.</p>","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2025 ","pages":"196-204"},"PeriodicalIF":0.0,"publicationDate":"2025-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12150715/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144276878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Travyse A Edwards, Tianhua Zhai, Kwangsik Nho, Andrew J Saykin, Qi Long, Li Shen
{"title":"Sex-Based Differences in the Association of Epigenetic Age Acceleration with Alzheimer's Disease Biomarkers and Cognitive Measures.","authors":"Travyse A Edwards, Tianhua Zhai, Kwangsik Nho, Andrew J Saykin, Qi Long, Li Shen","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Alzheimer's Disease (AD) is a neurodegenerative disorder marked by cognitive and functional decline. Biological sex has been linked to differences in lifetime AD risk, AD-related neuropathology, and the rate of cognitive decline, although the underlying biological mechanisms driving these disparities remain unclear. Epigenetic Age Acceleration, a metric derived from epigenetic aging clocks, has been associated with numerous aging-related conditions such as AD. Although there is promise in using epigenetic age acceleration as a biomarker for several aging-related diseases, the underlying mechanism that aging clocks are actually predicting is not well understood. In this study, we used data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) to examine how sex influences the relationship between age acceleration and cognitive performance as well as brain volume. Our findings suggest that, although epigenetic age acceleration can predict changes in brain structure, these changes don't appear to be different across sexes. Future research should focus on validating these findings in an external cohort and exploring them longitudinally.</p>","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2025 ","pages":"141-148"},"PeriodicalIF":0.0,"publicationDate":"2025-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12150745/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144276884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Alec B Chapman, Talia Panadero, Rachel Dalrymple, Alicia Cohen, Nipa Kamdar, Farhana Pethani, Andrea Kalvesmaki, Richard E Nelson, Jorie Butler
{"title":"Studying Veteran food insecurity longitudinally using electronic health record data and natural language processing.","authors":"Alec B Chapman, Talia Panadero, Rachel Dalrymple, Alicia Cohen, Nipa Kamdar, Farhana Pethani, Andrea Kalvesmaki, Richard E Nelson, Jorie Butler","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Food insecurity is an important social risk factor that is directly linked to patient health and well-being. The Department of Veterans Affairs (VA) aims to identify and resolve food insecurity through social and clinical interventions. However, evaluating the impact of such interventions is made challenging by the lack of follow-up data on Veteran food insecurity status. One potential solution is to leverage documentation of food insecurity in electronic health records (EHRs). In this paper, we developed and validated a natural language processing system to identify food insecurity status from clinical notes and applied it to study longitudinal trajectories of food insecurity among a large cohort of food insecure Veterans. Our analyses provide insight into the timing and persistence of Veteran food insecurity; in the future, our methods will be used to evaluate food insecurity interventions and evaluate VA policy.</p>","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2025 ","pages":"124-133"},"PeriodicalIF":0.0,"publicationDate":"2025-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12150752/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144276887","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
François Grolleau, Ethan Goh, Stephen P Ma, Jonathan Masterson, Ted Ross, Arnold Milstein, Jonathan H Chen
{"title":"Systematic Exploration of Hospital Cost Variability: A Conformal Prediction-Based Outlier Detection Method for Electronic Health Records.","authors":"François Grolleau, Ethan Goh, Stephen P Ma, Jonathan Masterson, Ted Ross, Arnold Milstein, Jonathan H Chen","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Marked variability in inpatient hospitalization costs poses significant challenges to healthcare quality, resource allocation, and patient outcomes. Traditional methods like Diagnosis-Related Groups (DRGs) aid in cost management but lack practical solutions for enhancing hospital care value. We introduce a novel methodology for outlier detection in Electronic Health Records (EHRs) using Conformal Prediction. This approach identifies and prioritizes areas for optimizing high-value care processes. Unlike conventional predictive models that neglect uncertainty, our method employs Conformal Quantile Regression (CQR) to generate robust prediction intervals, offering a comprehensive view of cost variability. By integrating Conformal Prediction with machine learning models, healthcare professionals can more accurately pinpoint opportunities for quality and efficiency improvements. Our framework systematically evaluates unexplained hospital cost variations and generates interpretable hypotheses for refining clinical practices associated with atypical costs. This data-driven approach offers a systematic method to generate clinically sound hypotheses that may inform processes to enhance care quality and optimize resource utilization.</p>","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2025 ","pages":"187-195"},"PeriodicalIF":0.0,"publicationDate":"2025-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12150741/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144276889","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ty J Skyles, Isaac J Freeman, Georgewilliam Kalibbala, David Davila-Garcia, Kendall Kiser, Silpa Raju, Adam Wilcox
{"title":"Exploring ChatGPT 3.5 for structured data extraction from oncological notes.","authors":"Ty J Skyles, Isaac J Freeman, Georgewilliam Kalibbala, David Davila-Garcia, Kendall Kiser, Silpa Raju, Adam Wilcox","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>In large-scale clinical informatics, there is a need to maximize the amount of usable data from electronic health records. With the adoption of large language models in medical research, there is potential to use them to extract structured data from unstructured clinical notes. We explored how ChatGPT could be used to improve data availability in cancer research. We assessed how GPT used clinical notes to answer six relevant clinical questions. Four prompt engineering strategies were used: zero-shot, zero-shot with context, few-shot, and few-shot with context. Few-shot prompting often decreased the accuracy of GPT outputs and context did not consistently improve accuracy. GPT extracted patients' Gleason scores and ages with an F1 score of 0.99 and it identified if patients received palliative care with and if patients were in pain with an F1 score of 0.86. Effective use of LLMs has potential to increase interoperability between healthcare and clinical research.</p>","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2025 ","pages":"518-526"},"PeriodicalIF":0.0,"publicationDate":"2025-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12150697/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144276874","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Temporal Rule Mining for Enhanced Risk Pattern Extraction: A Case Study with Acute Kidney Injury.","authors":"Ho Yin Chan, Alan S Yu, Mei Liu","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Association rule mining is a widely used data mining technique to uncover knowledge from large datasets. In healthcare, it can reveal meaningful patterns within electronic health records (EHR) that inform clinical decision-making and treatment strategies. However, many studies neglect the temporal aspects of EHR data, potentially overlooking patterns linked to specific time periods or sequences of clinical events. Recent advancements have introduced methods for mining temporal association rules, offering enhanced predictive and descriptive insights. We propose a multi-step framework that utilizes a temporal pattern mining algorithm to extract actionable and temporal risk patterns for acute kidney injury (AKI) from EHR data. Our algorithm identified approximately 3,313 rules with 10 actionable features, characterized by low support and high confidence. These rules have a median support of 0.055 and a median confidence of 0.58. We highlight key rules, explore their potential clinical implications, and present a network-based view to provide actionable insights.</p>","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2025 ","pages":"115-123"},"PeriodicalIF":0.0,"publicationDate":"2025-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12150717/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144276890","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yuhe Gao, Runxue Bao, Yuelyu Ji, Yiming Sun, Chenxi Song, Jeffrey P Ferraro, Ye Ye
{"title":"Transfer Learning with Clinical Concept Embeddings from Large Language Models.","authors":"Yuhe Gao, Runxue Bao, Yuelyu Ji, Yiming Sun, Chenxi Song, Jeffrey P Ferraro, Ye Ye","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Knowledge exchange is crucial in healthcare, particularly when leveraging data from multiple clinical sites to address data scarcity, reduce costs, and enable timely interventions. Transfer learning can facilitate cross-site knowledge transfer, yet a significant challenge is the heterogeneity in clinical concepts across different sites. Recently, Large Language Models (LLMs) have shown significant potential in capturing the semantic meanings of clinical concepts and mitigating heterogeneity in biomedicine. This study analyzed electronic health records from two large healthcare systems to assess the impact of semantic embeddings from LLMs on local models, shared models, and transfer learning models. The results indicate that domain-specific LLMs, such as Med-BERT, consistently outperform in local and direct transfer scenarios, whereas generic models like OpenAI embeddings may need fine-tuning for optimal performance. This study emphasizes the importance of domain-specific embeddings and meticulous model tuning for effective knowledge transfer in healthcare. It remains essential to investigate the balance between the complexity of downstream tasks, the size of training samples, and the extent of model tuning.</p>","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2025 ","pages":"167-176"},"PeriodicalIF":0.0,"publicationDate":"2025-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12150738/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144276904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RD-LIVES: A Living Evidence Synthesis System for Rare Disease Treatment Efficacy and Safety.","authors":"Jinlian Wang, Hui Li, Hongfang Liu","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Although rare diseases (RD) are gaining priority in healthcare worldwide, developing research policies for studying them in public settings remains challenging due to the limited evidence available. Evidence generation is crucial for rare diseases, requiring systematic assessment of study quality across multiple sources. Given the scarcity of patients, literature and clinical trial data for orphan drugs, we developed RD-LIVES, a tool designed to automatically accelerate evidence collection from literature and clinical trials for systematic reviews and meta-analyses. This tool enhances our understanding of treatment outcomes, determines appropriate follow-up durations, and informs the required treatment impact size for new drugs. Using Idiopathic Pulmonary Fibrosis (IPF) as an example, we demonstrate how RD-LIVES automates evidence collection and element extraction. The results indicate that RD-LIVES plays a vital role in designing costly prospective trials and has the potential to increase the likelihood of successful trial outcomes.</p>","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2025 ","pages":"607-613"},"PeriodicalIF":0.0,"publicationDate":"2025-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12150729/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144276881","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yishu Wei, Xindi Wang, Hanley Ong, Yiliang Zhou, Adam Flanders, George Shih, Yifan Peng
{"title":"Enhancing Disease Detection in Radiology Reports Through Fine-tuning Lightweight LLM on Weak Labels.","authors":"Yishu Wei, Xindi Wang, Hanley Ong, Yiliang Zhou, Adam Flanders, George Shih, Yifan Peng","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Despite significant progress in applying large language models (LLMs) to the medical domain, several limitations still prevent them from practical applications. Among these are the constraints on model size and the lack of cohort-specific labeled datasets. In this work, we investigated the potential of improving a lightweight LLM, such as Llama 3.1-8B, through fine-tuning with datasets using synthetic labels. Two tasks are jointly trained by combining their respective instruction datasets. When the quality of the task-specific synthetic labels is relatively high (e.g., generated by GPT4-o), Llama 3.1-8B achieves satisfactory performance on the open-ended disease detection task, with a micro F1 score of 0.91. Conversely, when the quality of the task-relevant synthetic labels is relatively low (e.g., from the MIMIC-CXR dataset), fine-tuned Llama 3.1-8B is able to surpass its noisy teacher labels (micro F1 score of 0.67 vs. 0.63) when calibrated against curated labels, indicating the strong inherent underlying capability of the model. These findings demonstrate the potential of fine-tuning LLMs with synthetic labels, offering a promising direction for future research on LLM specialization in the medical domain.</p>","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2025 ","pages":"614-623"},"PeriodicalIF":0.0,"publicationDate":"2025-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12150749/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144276868","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}