FHIR-Former: enhancing clinical predictions through Fast Healthcare Interoperability Resources and large language models.

Impact factor: 4.6 | Zone 2 (Medicine) | JCR Q1: Computer Science, Information Systems

Journal of the American Medical Informatics Association. Publication date: 2025-10-13. DOI: 10.1093/jamia/ocaf165

Merlin Engelke, Giulia Baldini, Jens Kleesiek, Felix Nensa, Amin Dada


Citations: 0

Abstract

Objective: To address the challenges of data heterogeneity and manual feature engineering in clinical predictive modeling, we introduce FHIR-Former, an open-source framework integrating Fast Healthcare Interoperability Resources (FHIR) with large language models (LLMs) to automate and standardize clinical prediction tasks.

Materials and methods: FHIR-Former dynamically processes structured (eg, lab results, medications) and unstructured (eg, clinical notes) data from FHIR resources. The pipeline supports multiple classification tasks, including 30-day readmission, imaging study prediction, and ICD code classification. Leveraging open-source LLMs (GeBERTa), we trained models on 1.1 million data points across ten FHIR resources using retrospective inpatient data (2018-2024). Hyperparameters were optimized via Bayesian methods, and outputs were mapped to FHIR RiskAssessment resources for interoperability.

Results: FHIR-Former achieved an F1-score of 70.7% and accuracy of 72.9% for 30-day readmission, 51.8% F1-score (88.1% accuracy) for mortality prediction, and 61% macro F1-score for imaging study classification. The ICD code prediction model attained 94% accuracy. The framework showed promising performance for readmission prediction and scaled across tasks without manual feature engineering.
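The results report both accuracy and F1, and the two can diverge sharply when one class is rare, as is typical for outcomes such as mortality. The sketch below computes accuracy, F1, and macro F1 from confusion-matrix counts. The counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from the study.

```python
def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Harmonic mean of precision and recall: 2TP / (2TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(per_class_counts: list[tuple[int, int, int]]) -> float:
    """Unweighted mean of per-class F1; each item is (tp, fp, fn) for one class."""
    scores = [f1_score(tp, fp, fn) for tp, fp, fn in per_class_counts]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical imbalanced test set: 100 positives, 900 negatives.
    tp, fp, fn, tn = 50, 50, 50, 850
    print(f"accuracy = {accuracy(tp, fp, fn, tn):.3f}")
    print(f"F1 (positive class) = {f1_score(tp, fp, fn):.3f}")
    # For the negative class, the roles of the counts swap:
    # its true positives are tn, its false positives are fn, its false negatives are fp.
    print(f"macro F1 = {macro_f1([(tp, fp, fn), (tn, fn, fp)]):.3f}")
```

With these counts, accuracy is 0.9 while the positive-class F1 is only 0.5, because the many correctly classified negatives raise accuracy but do not enter the F1 of the rare class.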

Discussion: FHIR-Former eliminates institution-specific preprocessing by adapting to diverse FHIR implementations, enabling seamless integration of multimodal data. Its configurable architecture outperformed prior frameworks reliant on static inputs or limited to unstructured text. Real-time risk scores embedded in FHIR servers enhance clinical workflows without disrupting existing practices.

Conclusion: By harmonizing FHIR standardization with LLM flexibility, FHIR-Former advances scalable, interoperable predictive modeling in healthcare. The open-source framework facilitates automation, improves resource allocation, and supports personalized decision-making, bridging gaps between AI innovation and clinical practice.


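Source journal: Journal of the American Medical Informatics Association (category: Medicine; Computer Science, Interdisciplinary Applications)

CiteScore: 14.50

Self-citation rate: 7.80%

Publication volume: 230

Review time: 3-8 weeks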

About the journal: JAMIA is AMIA's premier peer-reviewed journal for biomedical and health informatics. Covering the full spectrum of activities in the field, JAMIA includes informatics articles in the areas of clinical care, clinical research, translational science, implementation science, imaging, education, consumer health, public health, and policy. JAMIA's articles describe innovative informatics research and systems that help to advance biomedical science and to promote health. Case reports, perspectives, and reviews also help readers stay connected with the most important informatics developments in implementation, policy, and education.