Journal of the American Medical Informatics Association: Latest Articles

Correction to: Artificial intelligence for optimizing recruitment and retention in clinical trials: a scoping review.
IF 4.7, Zone 2, Medicine
Journal of the American Medical Informatics Association, Pub Date: 2025-01-01, DOI: 10.1093/jamia/ocae283
Pages: 260. Publication type: Journal Article. PDF (PubMed Central): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11648702/pdf/
Citations: 0
Is ChatGPT worthy enough for provisioning clinical decision support?
IF 4.7, Zone 2, Medicine
Journal of the American Medical Informatics Association, Pub Date: 2025-01-01, DOI: 10.1093/jamia/ocae282
Partha Pratim Ray
Pages: 258-259. Publication type: Journal Article. PDF (PubMed Central): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11648701/pdf/
Citations: 0
Machine learning-based infection diagnostic and prognostic models in post-acute care settings: a systematic review.
IF 4.7, Zone 2, Medicine
Journal of the American Medical Informatics Association, Pub Date: 2025-01-01, DOI: 10.1093/jamia/ocae278
Zidu Xu, Danielle Scharp, Mollie Hobensack, Jiancheng Ye, Jungang Zou, Sirui Ding, Jingjing Shang, Maxim Topaz
Objectives: This study aims to (1) review machine learning (ML)-based models for early infection diagnosis and prognosis prediction in post-acute care (PAC) settings, (2) identify key risk predictors influencing infection-related outcomes, and (3) examine the quality and limitations of these models.
Materials and methods: PubMed, Web of Science, Scopus, IEEE Xplore, CINAHL, and ACM Digital Library were searched in February 2024. Eligible studies leveraged PAC data to develop and evaluate ML models for infection-related risks. Data extraction followed the CHARMS checklist. Quality appraisal followed the PROBAST tool. Data synthesis was guided by the socio-ecological conceptual framework.
Results: Thirteen studies were included, mainly focusing on respiratory infections and nursing homes. Most used regression models with structured electronic health record data. Since 2020, there has been a shift toward advanced ML algorithms and multimodal data, with biosensors and clinical notes being significant sources of unstructured data. Despite these advances, there is insufficient evidence to support performance improvements over traditional models. Individual-level risk predictors, like impaired cognition, functional decline, and tachycardia, were commonly used, while contextual-level predictors were barely utilized, consequently limiting model fairness. Major sources of bias included lack of external validation, inadequate model calibration, and insufficient consideration of data complexity.
Discussion and conclusion: Despite the growth of advanced modeling approaches in infection-related models in PAC settings, evidence supporting their superiority remains limited. Future research should leverage a socio-ecological lens for predictor selection and model construction, exploring optimal data modalities and ML model usage in PAC, while ensuring rigorous methodologies and fairness considerations.
Pages: 241-252. Publication type: Journal Article. PDF (PubMed Central): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11648729/pdf/
Citations: 0
A novel generative multi-task representation learning approach for predicting postoperative complications in cardiac surgery patients.
IF 4.7, Zone 2, Medicine
Journal of the American Medical Informatics Association, Pub Date: 2024-12-28, DOI: 10.1093/jamia/ocae316
Junbo Shen, Bing Xue, Thomas Kannampallil, Chenyang Lu, Joanna Abraham
Objective: Early detection of surgical complications allows for timely therapy and proactive risk mitigation. Machine learning (ML) can be leveraged to identify and predict patient risks for postoperative complications. We developed a novel surgical Variational Autoencoder (surgVAE), which uncovers intrinsic patterns via cross-task and cross-cohort representation learning, and validated its effectiveness in predicting postoperative complications.
Materials and methods: This retrospective cohort study used data from the electronic health records of adult surgical patients over 4 years (2018-2021). Six key postoperative complications for cardiac surgery were assessed: acute kidney injury, atrial fibrillation, cardiac arrest, deep vein thrombosis or pulmonary embolism, blood transfusion, and other intraoperative cardiac events. We compared surgVAE's prediction performance against widely used ML models and advanced representation learning and generative models under 5-fold cross-validation.
Results: A total of 89 246 surgeries (49% male, median [IQR] age: 57 [45-69]) were included, with 6502 in the targeted cardiac surgery cohort (61% male, median [IQR] age: 60 [53-70]). surgVAE demonstrated generally superior performance over existing ML solutions across postoperative complications of cardiac surgery patients, achieving macro-averaged AUPRC of 0.409 and macro-averaged AUROC of 0.831, which were 3.4% and 3.7% higher, respectively, than the best alternative method (by AUPRC scores). Model interpretation using Integrated Gradients highlighted key risk factors based on preoperative variable importance.
Discussion and conclusion: Our advanced representation learning framework surgVAE showed excellent discriminatory performance for predicting postoperative complications and addressing the challenges of data complexity, small cohort sizes, and low-frequency positive events. surgVAE enables data-driven predictions of patient risks and prognosis while enhancing the interpretability of patient risk profiles.
Publication type: Journal Article.
Citations: 0
Enhancing patient representation learning with inferred family pedigrees improves disease risk prediction.
IF 4.7, Zone 2, Medicine
Journal of the American Medical Informatics Association, Pub Date: 2024-12-26, DOI: 10.1093/jamia/ocae297
Xiayuan Huang, Jatin Arora, Abdullah Mesut Erzurumluoglu, Stephen A Stanhope, Daniel Lam, Hongyu Zhao, Zhihao Ding, Zuoheng Wang, Johann de Jong
Background: Machine learning and deep learning are powerful tools for analyzing electronic health records (EHRs) in healthcare research. Although family health history has been recognized as a major predictor for a wide spectrum of diseases, research has so far adopted a limited view of family relations, essentially treating patients as independent samples in the analysis.
Methods: To address this gap, we present ALIGATEHR, which models inferred family relations in a graph attention network augmented with an attention-based medical ontology representation, thus accounting for the complex influence of genetics, shared environmental exposures, and disease dependencies.
Results: Taking disease risk prediction as a use case, we demonstrate that explicitly modeling family relations significantly improves predictions across the disease spectrum. We then show how ALIGATEHR's attention mechanism, which links patients' disease risk to their relatives' clinical profiles, successfully captures genetic aspects of diseases using longitudinal EHR diagnosis data. Finally, we use ALIGATEHR to successfully distinguish the 2 main inflammatory bowel disease subtypes with highly shared risk factors and symptoms (Crohn's disease and ulcerative colitis).
Conclusion: Overall, our results highlight that family relations should not be overlooked in EHR research and illustrate ALIGATEHR's great potential for enhancing patient representation learning for predictive and interpretable modeling of EHRs.
Publication type: Journal Article.
Citations: 0
New indices to track interoperability among US hospitals.
IF 4.7, Zone 2, Medicine
Journal of the American Medical Informatics Association, Pub Date: 2024-12-26, DOI: 10.1093/jamia/ocae289
Catherine E Strawley, Julia Adler-Milstein, A Jay Holmgren, Jordan Everson
Objectives: To develop indices of US hospital interoperability to capture the current state and assess progress over time.
Materials and methods: A Technical Expert Panel (TEP) informed selection of items from the American Hospital Association Health IT Supplement survey, which were aggregated into interoperability concepts (components) and then further combined into indices. Indices were refined through psychometric analysis and additional TEP input. Final indices included a "Core Index" measuring adoption of foundational interoperability capabilities, a "Pathfinder Index" representing adoption of advanced interoperability technologies and auxiliary exchange activities, and a "Friction Index" quantifying barriers. The first 2 indices were scored from 0 (no interoperability) to 100 (full interoperability); the Friction Index was scored from 0 (no friction) to 100 (maximum friction). We calculated indices annually from 2021 to 2023, stratifying by hospital characteristics.
Results: Items within components created reliable and meaningful measures, and associations between components within indices followed the TEP's expectations. Weighted mean scores for the Core (2023), Pathfinder (2022), and Friction (2023) Indices were 61, 57, and 30, respectively. Hospitals with 500+ beds (large), not designated as critical access, in metropolitan areas, and using market-leading electronic health records had statistically significantly higher mean scores on all indices. Index values also improved modestly over time.
Discussion: Hospitals performed best on the Core Index. Given recent policy and programmatic initiatives, we anticipate continued improvement across all indices.
Conclusion: Ongoing index tracking can inform policy impact evaluations and highlight persistent interoperability disparities across hospitals.
Publication type: Journal Article.
Citations: 0
Analysis of eligibility criteria clusters based on large language models for clinical trial design.
IF 4.7, Zone 2, Medicine
Journal of the American Medical Informatics Association, Pub Date: 2024-12-26, DOI: 10.1093/jamia/ocae311
Alban Bornet, Philipp Khlebnikov, Florian Meer, Quentin Haas, Anthony Yazdani, Boya Zhang, Poorya Amini, Douglas Teodoro
Objectives: Clinical trials (CTs) are essential for improving patient care by evaluating new treatments' safety and efficacy. A key component in CT protocols is the study population defined by the eligibility criteria. This study aims to evaluate the effectiveness of large language models (LLMs) in encoding eligibility criterion information to support CT-protocol design.
Materials and methods: We extracted eligibility criterion sections, phases, conditions, and interventions from CT protocols available in the ClinicalTrials.gov registry. Eligibility sections were split into individual rules using a criterion tokenizer and embedded using LLMs. The obtained representations were clustered. The quality and relevance of the clusters for protocol design were evaluated through 3 experiments: intrinsic alignment with protocol information and human expert cluster coherence assessment, extrinsic evaluation through CT-level classification tasks, and eligibility section generation.
Results: Sentence embeddings fine-tuned using biomedical corpora produce clusters with the highest alignment to CT-level information. Human expert evaluation confirms that clusters are well structured and coherent. Despite the high information compression, clusters retain significant CT information, up to 97% of the classification performance obtained with raw embeddings. Finally, eligibility sections automatically generated using clusters achieve 95% of the ROUGE scores obtained with a generative LLM prompted with CT-protocol details, suggesting that clusters encapsulate information useful to CT-protocol design.
Discussion: Clusters derived from sentence-level LLM embeddings effectively summarize complex eligibility criterion data while retaining relevant CT-protocol details. Clustering-based approaches provide a scalable enhancement in CT design that balances information compression with accuracy.
Conclusions: Clustering eligibility criteria using LLM embeddings provides a practical and efficient method to summarize critical protocol information. We provide an interactive visualization of the pipeline here.
Publication type: Journal Article.
Citations: 0
CARE-SD: classifier-based analysis for recognizing provider stigmatizing and doubt marker labels in electronic health records: model development and validation.
IF 4.7, Zone 2, Medicine
Journal of the American Medical Informatics Association, Pub Date: 2024-12-26, DOI: 10.1093/jamia/ocae310
Andrew Walker, Annie Thorne, Sudeshna Das, Jennifer Love, Hannah L F Cooper, Melvin Livingston, Abeed Sarker
Objective: To detect and classify features of stigmatizing and biased language in intensive care electronic health records (EHRs) using natural language processing techniques.
Materials and methods: We first created a lexicon and regular expression lists from literature-driven stem words for linguistic features of stigmatizing patient labels, doubt markers, and scare quotes within EHRs. The lexicon was further extended using Word2Vec and GPT-3.5, and refined through human evaluation. These lexicons were used to search for matches across 18 million sentences from the de-identified Medical Information Mart for Intensive Care-III (MIMIC-III) dataset. For each linguistic bias feature, 1000 sentence matches were sampled, labeled by expert clinical and public health annotators, and used to train supervised learning classifiers.
Results: Lexicon development from expanded literature stem-word lists resulted in a doubt marker lexicon containing 58 expressions, and a stigmatizing labels lexicon containing 127 expressions. Classifiers for doubt markers and stigmatizing labels had the highest performance, with macro F1-scores of 0.84 and 0.79, positive-label recall and precision values ranging from 0.71 to 0.86, and accuracies aligning closely with human annotator agreement (0.87).
Discussion: This study demonstrated the feasibility of supervised classifiers in automatically identifying stigmatizing labels and doubt markers in medical text and identified trends in stigmatizing language use in an EHR setting. Additional labeled data may help improve the lower performance of the scare quote model.
Conclusions: Classifiers developed in this study showed high model performance and can be applied to identify patterns and target interventions to reduce stigmatizing labels and doubt markers in healthcare systems.
Publication type: Journal Article.
Citations: 0
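The CARE-SD abstract above reports macro F1-scores. As a reminder of what that metric averages, here is a minimal sketch in plain Python; the label lists are hypothetical toy data, not values from the study, and the function names are illustrative rather than the authors' code.

```python
from typing import List, Sequence


def f1_for_label(y_true: Sequence[int], y_pred: Sequence[int], label: int) -> float:
    """F1 for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        return 0.0  # no true positives: define F1 as 0 to avoid dividing by zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    labels: List[int] = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, label) for label in labels) / len(labels)


# Hypothetical annotations: 1 = sentence contains a doubt marker, 0 = it does not.
gold = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0]
print(f"macro F1 = {macro_f1(gold, predicted):.3f}")
```

Because each class contributes equally, macro F1 penalizes a classifier that does well only on the majority class, which matters when positive labels are comparatively rare.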
Descriptive epidemiology demonstrating the All of Us database as a versatile resource for the rare and undiagnosed disease community.
IF 4.7, Zone 2, Medicine
Journal of the American Medical Informatics Association, Pub Date: 2024-12-23, DOI: 10.1093/jamia/ocae241
Drenen J Magee, Sierra Kicker, Aeisha Thomas
Objective: We aim to demonstrate the versatility of the All of Us database as an important source of rare and undiagnosed disease (RUD) data, because of its large size and range of data types.
Materials and methods: We searched the public data browser, electronic health record (EHR), and several surveys to investigate the prevalence, mental health, healthcare access, and other data of select RUDs.
Results: Several RUDs have participants in All of Us [eg, 75 of 100 rare infectious diseases (RIDs)]. We generated health-related data for undiagnosed, sickle cell disease (SCD), cystic fibrosis (CF), and infectious (2 diseases) and chronic (4 diseases) disease pools.
Conclusion: Our results highlight the potential value of All of Us with both data breadth and depth to help identify possible solutions for shared and disease-specific biomedical and other problems such as healthcare access, thus enhancing diagnosis, treatment, prevention, and support for the RUD community.
Publication type: Journal Article.
Citations: 0
Lessons learned on information retrieval in electronic health records: a comparison of embedding models and pooling strategies.
IF 4.7, Zone 2, Medicine
Journal of the American Medical Informatics Association, Pub Date: 2024-12-20, DOI: 10.1093/jamia/ocae308
Skatje Myers, Timothy A Miller, Yanjun Gao, Matthew M Churpek, Anoop Mayampurath, Dmitriy Dligach, Majid Afshar
Objectives: Applying large language models (LLMs) to the clinical domain is challenging due to the context-heavy nature of processing medical records. Retrieval-augmented generation (RAG) offers a solution by facilitating reasoning over large text sources. However, there are many parameters to optimize in the retrieval system alone. This paper presents an ablation study exploring how different embedding models and pooling methods affect information retrieval for the clinical domain.
Materials and methods: Evaluating on 3 retrieval tasks on 2 electronic health record (EHR) data sources, we compared 7 models, including medical- and general-domain models, specialized encoder embedding models, and off-the-shelf decoder LLMs. We also examined the choice of embedding pooling strategy for each model, independently on the query and the text to retrieve.
Results: We found that the choice of embedding model significantly impacts retrieval performance, with BGE, a comparatively small general-domain model, consistently outperforming all others, including medical-specific models. However, our findings also revealed substantial variability across datasets and query text phrasings. We also determined the best pooling methods for each of these models to guide future design of retrieval systems.
Discussion: The choice of embedding model, pooling strategy, and query formulation can significantly impact retrieval performance, and the performance of these models on other public benchmarks does not necessarily transfer to new domains. The high variability in performance across different query phrasings suggests that the choice of query may need to be tuned and validated for each task, or even for each institution's EHR.
Conclusion: This study provides empirical evidence to guide the selection of models and pooling strategies for RAG frameworks in healthcare applications. Further studies such as this one are vital for guiding empirically grounded development of retrieval frameworks, such as in the context of RAG, for the clinical domain.
Publication type: Journal Article.
Citations: 0
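The Myers et al. abstract above says that the pooling strategy was chosen independently for the query and for the text to retrieve. The sketch below illustrates that idea with three common pooling choices (mean, first-token, last-token) and cosine similarity. It is an assumption-laden toy: the 2-dimensional token vectors are made up, the function names are illustrative, and the abstract does not state which pooling methods or similarity measure the study actually used.

```python
import math
from typing import Callable, List

Vector = List[float]
Pooler = Callable[[List[Vector]], Vector]


def mean_pool(token_vectors: List[Vector]) -> Vector:
    """Average the token vectors element-wise."""
    count = len(token_vectors)
    dim = len(token_vectors[0])
    return [sum(vec[i] for vec in token_vectors) / count for i in range(dim)]


def first_token_pool(token_vectors: List[Vector]) -> Vector:
    """Use the first token's vector (CLS-style pooling)."""
    return list(token_vectors[0])


def last_token_pool(token_vectors: List[Vector]) -> Vector:
    """Use the final token's vector, a frequent choice for decoder models."""
    return list(token_vectors[-1])


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def score(query_tokens: List[Vector], text_tokens: List[Vector],
          query_pool: Pooler, text_pool: Pooler) -> float:
    """Pool each side with its own strategy, then compare the pooled vectors."""
    return cosine(query_pool(query_tokens), text_pool(text_tokens))


# Hypothetical token embeddings, for illustration only.
query_tokens = [[1.0, 0.0], [0.0, 1.0]]
note_tokens = [[1.0, 1.0], [4.0, 0.0]]

print("mean / mean:", score(query_tokens, note_tokens, mean_pool, mean_pool))
print("first / last:", score(query_tokens, note_tokens, first_token_pool, last_token_pool))
```

Even on this toy input the two pooling combinations give different similarity scores for the same query and text, which is the kind of sensitivity an ablation over pooling strategies is designed to measure.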