Transfer Learning with Clinical Concept Embeddings from Large Language Models.

Yuhe Gao, Runxue Bao, Yuelyu Ji, Yiming Sun, Chenxi Song, Jeffrey P Ferraro, Ye Ye
{"title":"Transfer Learning with Clinical Concept Embeddings from Large Language Models.","authors":"Yuhe Gao, Runxue Bao, Yuelyu Ji, Yiming Sun, Chenxi Song, Jeffrey P Ferraro, Ye Ye","doi":"","DOIUrl":null,"url":null,"abstract":"<p><p>Knowledge exchange is crucial in healthcare, particularly when leveraging data from multiple clinical sites to address data scarcity, reduce costs, and enable timely interventions. Transfer learning can facilitate cross-site knowledge transfer, yet a significant challenge is the heterogeneity in clinical concepts across different sites. Recently, Large Language Models (LLMs) have shown significant potential in capturing the semantic meanings of clinical concepts and mitigating heterogeneity in biomedicine. This study analyzed electronic health records from two large healthcare systems to assess the impact of semantic embeddings from LLMs on local models, shared models, and transfer learning models. The results indicate that domain-specific LLMs, such as Med-BERT, consistently outperform in local and direct transfer scenarios, whereas generic models like OpenAI embeddings may need fine-tuning for optimal performance. This study emphasizes the importance of domain-specific embeddings and meticulous model tuning for effective knowledge transfer in healthcare. It remains essential to investigate the balance the balance between the complexity of downstream tasks, the size of training samples, and the extent of model tuning.</p>","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. 
AMIA Joint Summits on Translational Science","volume":"2025 ","pages":"167-176"},"PeriodicalIF":0.0000,"publicationDate":"2025-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12150738/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","FirstCategoryId":"1085","ListUrlMain":"","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/1/1 0:00:00","PubModel":"eCollection","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Knowledge exchange is crucial in healthcare, particularly when leveraging data from multiple clinical sites to address data scarcity, reduce costs, and enable timely interventions. Transfer learning can facilitate cross-site knowledge transfer, yet a significant challenge is the heterogeneity in clinical concepts across different sites. Recently, Large Language Models (LLMs) have shown significant potential in capturing the semantic meanings of clinical concepts and mitigating heterogeneity in biomedicine. This study analyzed electronic health records from two large healthcare systems to assess the impact of semantic embeddings from LLMs on local models, shared models, and transfer learning models. The results indicate that domain-specific LLMs, such as Med-BERT, consistently outperform in local and direct transfer scenarios, whereas generic models like OpenAI embeddings may need fine-tuning for optimal performance. This study emphasizes the importance of domain-specific embeddings and meticulous model tuning for effective knowledge transfer in healthcare. It remains essential to investigate the balance between the complexity of downstream tasks, the size of training samples, and the extent of model tuning.
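The cross-site setup the abstract describes can be sketched with a toy example: two sites record the same clinical concept under different local codes, but a shared semantic embedding space lets a model fitted at one site score patients at the other without a manual code-mapping table. Everything below is hypothetical — the codes ("A123", "B456", etc.), the random vectors standing in for LLM embeddings, and the hand-picked linear weights; a real pipeline would obtain concept embeddings from a domain-specific model such as Med-BERT.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8

# Toy shared embedding space: each site uses its own local code for the same
# clinical concept, but semantically similar codes land near the same point.
base = {name: rng.normal(size=dim) for name in ["diabetes", "hypertension"]}
embed = {
    "A123": base["diabetes"] + 0.05 * rng.normal(size=dim),      # site A's code
    "B456": base["diabetes"] + 0.05 * rng.normal(size=dim),      # site B's code
    "A777": base["hypertension"] + 0.05 * rng.normal(size=dim),  # site A's code
    "B888": base["hypertension"] + 0.05 * rng.normal(size=dim),  # site B's code
}

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def patient_features(codes):
    """Represent a patient as the mean embedding of their concept codes."""
    return np.mean([embed[c] for c in codes], axis=0)

# Semantically equivalent codes are close even though the code strings differ.
sim = cosine(embed["A123"], embed["B456"])

# An illustrative linear scorer "trained" at site A (weights are hand-picked
# here, standing in for a model actually fitted on site-A data).
w = base["diabetes"]

score_a = float(w @ patient_features(["A123", "A777"]))
score_b = float(w @ patient_features(["B456", "B888"]))
# Because both sites share one embedding space, the site-A model produces
# nearly identical scores for the matched site-B patient.
```

The design point this illustrates is the one the study tests at scale: when features are embeddings rather than site-specific code identifiers, heterogeneity in local vocabularies stops being a barrier to direct model transfer.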
