Yuhe Gao, Runxue Bao, Yuelyu Ji, Yiming Sun, Chenxi Song, Jeffrey P Ferraro, Ye Ye
{"title":"基于大型语言临床概念嵌入的迁移学习。","authors":"Yuhe Gao, Runxue Bao, Yuelyu Ji, Yiming Sun, Chenxi Song, Jeffrey P Ferraro, Ye Ye","doi":"","DOIUrl":null,"url":null,"abstract":"<p><p>Knowledge exchange is crucial in healthcare, particularly when leveraging data from multiple clinical sites to address data scarcity, reduce costs, and enable timely interventions. Transfer learning can facilitate cross-site knowledge transfer, yet a significant challenge is the heterogeneity in clinical concepts across different sites. Recently, Large Language Models (LLMs) have shown significant potential in capturing the semantic meanings of clinical concepts and mitigating heterogeneity in biomedicine. This study analyzed electronic health records from two large healthcare systems to assess the impact of semantic embeddings from LLMs on local models, shared models, and transfer learning models. The results indicate that domain-specific LLMs, such as Med-BERT, consistently outperform in local and direct transfer scenarios, whereas generic models like OpenAI embeddings may need fine-tuning for optimal performance. This study emphasizes the importance of domain-specific embeddings and meticulous model tuning for effective knowledge transfer in healthcare. It remains essential to investigate the balance the balance between the complexity of downstream tasks, the size of training samples, and the extent of model tuning.</p>","PeriodicalId":72181,"journal":{"name":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","volume":"2025 ","pages":"167-176"},"PeriodicalIF":0.0000,"publicationDate":"2025-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12150738/pdf/","citationCount":"0","resultStr":"{\"title\":\"Transfer Learning with Clinical Concept Embeddings from Large Language Models.\",\"authors\":\"Yuhe Gao, Runxue Bao, Yuelyu Ji, Yiming Sun, Chenxi Song, Jeffrey P Ferraro, Ye Ye\",\"doi\":\"\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Knowledge exchange is crucial in healthcare, particularly when leveraging data from multiple clinical sites to address data scarcity, reduce costs, and enable timely interventions. Transfer learning can facilitate cross-site knowledge transfer, yet a significant challenge is the heterogeneity in clinical concepts across different sites. Recently, Large Language Models (LLMs) have shown significant potential in capturing the semantic meanings of clinical concepts and mitigating heterogeneity in biomedicine. This study analyzed electronic health records from two large healthcare systems to assess the impact of semantic embeddings from LLMs on local models, shared models, and transfer learning models. The results indicate that domain-specific LLMs, such as Med-BERT, consistently outperform in local and direct transfer scenarios, whereas generic models like OpenAI embeddings may need fine-tuning for optimal performance. This study emphasizes the importance of domain-specific embeddings and meticulous model tuning for effective knowledge transfer in healthcare. It remains essential to investigate the balance the balance between the complexity of downstream tasks, the size of training samples, and the extent of model tuning.</p>\",\"PeriodicalId\":72181,\"journal\":{\"name\":\"AMIA Joint Summits on Translational Science proceedings. 
AMIA Joint Summits on Translational Science\",\"volume\":\"2025 \",\"pages\":\"167-176\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2025-06-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12150738/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2025/1/1 0:00:00\",\"PubModel\":\"eCollection\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"AMIA Joint Summits on Translational Science proceedings. AMIA Joint Summits on Translational Science","FirstCategoryId":"1085","ListUrlMain":"","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/1/1 0:00:00","PubModel":"eCollection","JCR":"","JCRName":"","Score":null,"Total":0}
Transfer Learning with Clinical Concept Embeddings from Large Language Models.
Knowledge exchange is crucial in healthcare, particularly when leveraging data from multiple clinical sites to address data scarcity, reduce costs, and enable timely interventions. Transfer learning can facilitate cross-site knowledge transfer, yet a major challenge is the heterogeneity of clinical concepts across sites. Recently, Large Language Models (LLMs) have shown significant potential in capturing the semantic meaning of clinical concepts and mitigating this heterogeneity in biomedicine. This study analyzed electronic health records from two large healthcare systems to assess the impact of semantic embeddings from LLMs on local models, shared models, and transfer learning models. The results indicate that domain-specific LLMs, such as Med-BERT, consistently outperform generic models in local and direct transfer scenarios, whereas generic embeddings such as OpenAI's may require fine-tuning for optimal performance. This study emphasizes the importance of domain-specific embeddings and careful model tuning for effective knowledge transfer in healthcare. It remains essential to investigate the balance between the complexity of downstream tasks, the size of training samples, and the extent of model tuning.
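For illustration, the following is a minimal sketch of how clinical concept embeddings might be extracted from a domain-specific encoder via Hugging Face transformers. It is not the paper's pipeline: Bio_ClinicalBERT is used only as a stand-in text encoder (Med-BERT itself operates on structured EHR code sequences, not free text), and the embed_concepts helper is hypothetical.

```python
# Minimal sketch (assumptions noted): mean-pooled concept embeddings from a
# BERT-style clinical encoder. Bio_ClinicalBERT stands in for a
# domain-specific model; the paper's Med-BERT consumes structured EHR codes.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "emilyalsentzer/Bio_ClinicalBERT"  # assumption: stand-in encoder

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME)
model.eval()

def embed_concepts(concepts: list[str]) -> torch.Tensor:
    """Return one mean-pooled embedding per clinical concept string."""
    batch = tokenizer(concepts, padding=True, truncation=True,
                      return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state        # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)         # (B, T, 1)
    # Average only over real (non-padding) tokens.
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

# Example: embed heterogeneous concept labels drawn from two sites, so that
# downstream local, shared, or transfer models share one semantic space.
vectors = embed_concepts(["type 2 diabetes mellitus",
                          "acute myocardial infarction"])
print(vectors.shape)  # torch.Size([2, 768])
```

Mapping each site's local concept vocabulary into a shared embedding space in this way is one route to the cross-site knowledge transfer the abstract describes; the study's finding suggests such embeddings transfer best when the encoder is domain-specific or fine-tuned.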