FedFSA: Hybrid and federated framework for functional status ascertainment across institutions

Impact factor: 4.0 | Zone 2 (Medicine) | JCR Q2: Computer Science, Interdisciplinary Applications

Journal of Biomedical Informatics | Pub Date: 2024-03-06 | DOI: 10.1016/j.jbi.2024.104623

Sunyang Fu, Heling Jia, Maria Vassilaki, Vipina K. Keloth, Yifang Dang, Yujia Zhou, Muskan Garg, Ronald C. Petersen, Jennifer St Sauver, Sungrim Moon, Liwei Wang, Andrew Wen, Fang Li, Hua Xu, Cui Tao, Jungwei Fan, Hongfang Liu, Sunghwan Sohn


Citations: 0

Abstract


FedFSA: Hybrid and federated framework for functional status ascertainment across institutions


Introduction

Functional status describes a patient's independence in performing activities of daily living (ADLs), including basic ADLs (bADL) and more complex instrumental ADLs (iADL). Existing studies have found that patients' functional status is a strong predictor of health outcomes, particularly in older adults. Despite its usefulness, much functional status information is stored in electronic health records (EHRs) in semi-structured or free-text formats. This indicates a pressing need to leverage computational approaches such as natural language processing (NLP) to accelerate the curation of functional status information. In this study, we introduced FedFSA, a hybrid and federated NLP framework designed to extract functional status information from EHRs across multiple healthcare institutions.

Methods

FedFSA consists of four major components: 1) individual sites (clients) with their private local data, 2) a rule-based information extraction (IE) framework for ADL extraction, 3) a BERT model for functional status impairment classification, and 4) a concept normalizer. The framework was implemented using the OHNLP Backbone for rule-based IE and the open-source Flower and PyTorch libraries for the federated BERT components. For gold standard data generation, we carried out corpus annotation to identify functional status-related expressions based on International Classification of Functioning, Disability and Health (ICF) definitions. Four healthcare institutions were included in the study. To assess FedFSA, we evaluated the performance of category- and institution-specific ADL extraction across different experimental designs.
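To make the federated step concrete, the sketch below shows weighted federated averaging (FedAvg) in plain Python. This is an illustration only: the abstract does not state which aggregation strategy was used, so FedAvg is an assumption here, and the function name `federated_average`, the parameter name `classifier.bias`, and the example counts are all hypothetical. Each site trains on its private data and shares only model weights, which a server combines in proportion to each site's number of local training examples.

```python
from typing import Dict, List, Tuple

# Parameter name -> flattened parameter values (a stand-in for real tensors).
Weights = Dict[str, List[float]]


def federated_average(updates: List[Tuple[Weights, int]]) -> Weights:
    """Average client weights, weighting each client by its local example count."""
    if not updates:
        raise ValueError("need at least one client update")
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("total example count must be positive")

    first_weights, _ = updates[0]
    averaged: Weights = {}
    for name, values in first_weights.items():
        acc = [0.0] * len(values)
        for weights, n in updates:
            for i, v in enumerate(weights[name]):
                # Each client's contribution is scaled by its share of the data.
                acc[i] += v * n / total
        averaged[name] = acc
    return averaged


# Hypothetical updates from two sites: (weights, number of local examples).
site_a = ({"classifier.bias": [0.0, 1.0]}, 100)
site_b = ({"classifier.bias": [1.0, 3.0]}, 300)
global_weights = federated_average([site_a, site_b])
print(global_weights)
```

In this toy run the site with 300 examples contributes three quarters of the averaged value, so no raw clinical text ever leaves a site; only the numeric weights are exchanged.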

Results

ADL extraction F1-scores ranged from 0.907 to 0.986 for bADL and from 0.825 to 0.951 for iADL across the four healthcare sites. For ADL extraction with impairment, F1-scores ranged from 0.722 to 0.954 for bADL and from 0.674 to 0.813 for iADL across the four sites. For category-specific ADL extraction, laundry and transferring yielded relatively high performance, while dressing, medication, bathing, and continence achieved moderate-to-high performance. Conversely, food preparation and toileting showed low performance.
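For readers less familiar with the metric, the F1-score reported above is the harmonic mean of precision and recall. The sketch below computes it from true-positive, false-positive, and false-negative counts. The counts are hypothetical and not taken from the paper, and the abstract does not specify the matching criterion (exact or relaxed) or the averaging scheme, so this shows only the basic formula.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1-score (harmonic mean of precision and recall)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for one ADL category at one site:
# 90 correct extractions, 10 spurious, 10 missed.
score = f1_score(tp=90, fp=10, fn=10)
print(round(score, 3))  # precision and recall are both 0.9, so F1 is about 0.9
```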

Conclusion

NLP performance varied across ADL categories and healthcare sites. Federated learning with the FedFSA framework outperformed non-federated learning for impaired ADL extraction at all healthcare sites. Our study demonstrated the potential of the federated learning framework for functional status extraction and impairment classification in EHRs, underscoring the importance of large-scale, multi-institutional collaborative development efforts.


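Source journal: Journal of Biomedical Informatics (Medicine; Computer Science, Interdisciplinary Applications)

CiteScore: 8.90

Self-citation rate: 6.70%

Articles published: 243

Review time: 32 days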

About the journal: The Journal of Biomedical Informatics reflects a commitment to high-quality original research papers, reviews, and commentaries in the area of biomedical informatics methodology. Although we publish articles motivated by applications in the biomedical sciences (for example, clinical medicine, health care, population health, and translational bioinformatics), the journal emphasizes reports of new methodologies and techniques that have general applicability and that form the basis for the evolving science of biomedical informatics. Articles on medical devices; evaluations of implemented systems (including clinical trials of information technologies); or papers that provide insight into a biological process, a specific disease, or treatment options would generally be more suitable for publication in other venues. Papers on applications of signal processing and image analysis are often more suitable for biomedical engineering journals or other informatics journals, although we do publish papers that emphasize the information management and knowledge representation/modeling issues that arise in the storage and use of biological signals and images. System descriptions are welcome if they illustrate and substantiate the underlying methodology that is the principal focus of the report and an effort is made to address the generalizability and/or range of application of that methodology. Note also that, given the international nature of JBI, papers that deal with specific languages other than English, or with country-specific health systems or approaches, are acceptable for JBI only if they offer generalizable lessons that are relevant to the broad JBI readership, regardless of their country, language, culture, or health system.