Towards cross-application model-agnostic federated cohort discovery.

IF 4.7 2区医学 Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of the American Medical Informatics Association Pub Date : 2024-10-01 DOI:10.1093/jamia/ocae211

Nicholas J Dobbins, Michele Morris, Eugene Sadhu, Douglas MacFadden, Marc-Danie Nazaire, William Simons, Griffin Weber, Shawn Murphy, Shyam Visweswaran

{"title":"Towards cross-application model-agnostic federated cohort discovery.","authors":"Nicholas J Dobbins, Michele Morris, Eugene Sadhu, Douglas MacFadden, Marc-Danie Nazaire, William Simons, Griffin Weber, Shawn Murphy, Shyam Visweswaran","doi":"10.1093/jamia/ocae211","DOIUrl":null,"url":null,"abstract":"Objectives: To demonstrate that 2 popular cohort discovery tools, Leaf and the Shared Health Research Information Network (SHRINE), are readily interoperable. Specifically, we adapted Leaf to interoperate and function as a node in a federated data network that uses SHRINE and dynamically generate queries for heterogeneous data models.Materials and methods: SHRINE queries are designed to run on the Informatics for Integrating Biology & the Bedside (i2b2) data model. We created functionality in Leaf to interoperate with a SHRINE data network and dynamically translate SHRINE queries to other data models. We randomly selected 500 past queries from the SHRINE-based national Evolve to Next-Gen Accrual to Clinical Trials (ENACT) network for evaluation, and an additional 100 queries to refine and debug Leaf's translation functionality. We created a script for Leaf to convert the terms in the SHRINE queries into equivalent structured query language (SQL) concepts, which were then executed on 2 other data models.Results and discussion: 91.1% of the generated queries for non-i2b2 models returned counts within 5% (or ±5 patients for counts under 100) of i2b2, with 91.3% recall. Of the 8.9% of queries that exceeded the 5% margin, 77 of 89 (86.5%) were due to errors introduced by the Python script or the extract-transform-load process, which are easily fixed in a production deployment. The remaining errors were due to Leaf's translation function, which was later fixed.Conclusion: Our results support that cohort discovery applications such as Leaf and SHRINE can interoperate in federated data networks with heterogeneous data models.","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":"2202-2209"},"PeriodicalIF":4.7000,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11413448/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of the American Medical Informatics Association","FirstCategoryId":"91","ListUrlMain":"https://doi.org/10.1093/jamia/ocae211","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}

引用次数: 0

Abstract

Objectives: To demonstrate that 2 popular cohort discovery tools, Leaf and the Shared Health Research Information Network (SHRINE), are readily interoperable. Specifically, we adapted Leaf to interoperate and function as a node in a federated data network that uses SHRINE and dynamically generate queries for heterogeneous data models.

Materials and methods: SHRINE queries are designed to run on the Informatics for Integrating Biology & the Bedside (i2b2) data model. We created functionality in Leaf to interoperate with a SHRINE data network and dynamically translate SHRINE queries to other data models. We randomly selected 500 past queries from the SHRINE-based national Evolve to Next-Gen Accrual to Clinical Trials (ENACT) network for evaluation, and an additional 100 queries to refine and debug Leaf's translation functionality. We created a script for Leaf to convert the terms in the SHRINE queries into equivalent structured query language (SQL) concepts, which were then executed on 2 other data models.

Results and discussion: 91.1% of the generated queries for non-i2b2 models returned counts within 5% (or ±5 patients for counts under 100) of i2b2, with 91.3% recall. Of the 8.9% of queries that exceeded the 5% margin, 77 of 89 (86.5%) were due to errors introduced by the Python script or the extract-transform-load process, which are easily fixed in a production deployment. The remaining errors were due to Leaf's translation function, which was later fixed.

Conclusion: Our results support that cohort discovery applications such as Leaf and SHRINE can interoperate in federated data networks with heterogeneous data models.

查看原文本刊更多论文

实现跨应用模型的联合队列发现。

目的证明两种流行的队列发现工具--Leaf和共享健康研究信息网络（SHRINE）--可随时互操作。具体来说，我们对Leaf进行了改编，使其能够互操作，并作为使用SHRINE的联合数据网络中的一个节点，为异构数据模型动态生成查询：SHRINE查询被设计为在生物与床边整合信息学（i2b2）数据模型上运行。我们在Leaf中创建了与SHRINE数据网络互操作的功能，并将SHRINE查询动态转换为其他数据模型。我们从基于 SHRINE 的国家级 "进化到下一代临床试验（ENACT）"网络中随机选取了 500 个过去的查询进行评估，并另外选取了 100 个查询来完善和调试利夫的翻译功能。我们为 Leaf 创建了一个脚本，用于将 SHRINE 查询中的术语转换为等效的结构化查询语言（SQL）概念，然后在另外两个数据模型上执行。结果与讨论：在为非 i2b2 模型生成的查询中，91.1% 返回的计数在 i2b2 的 5%（或计数低于 100 的±5 名患者）以内，召回率为 91.3%。在 8.9% 超过 5% 的查询中，89 项中的 77 项（86.5%）是由于 Python 脚本或提取-转换-加载过程中引入的错误造成的，这些错误在生产部署中很容易修复。其余的错误是由于 Leaf 的翻译功能造成的，该功能后来得到了修复：我们的研究结果表明，像 Leaf 和 SHRINE 这样的队列发现应用程序可以在具有异构数据模型的联合数据网络中实现互操作。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Journal of the American Medical Informatics Association 医学-计算机：跨学科应用

CiteScore

14.50

自引率

7.80%

发文量

230

审稿时长

3-8 weeks

期刊介绍： JAMIA is AMIA''s premier peer-reviewed journal for biomedical and health informatics. Covering the full spectrum of activities in the field, JAMIA includes informatics articles in the areas of clinical care, clinical research, translational science, implementation science, imaging, education, consumer health, public health, and policy. JAMIA''s articles describe innovative informatics research and systems that help to advance biomedical science and to promote health. Case reports, perspectives and reviews also help readers stay connected with the most important informatics developments in implementation, policy and education.