Nicholas J Dobbins, Michele Morris, Eugene Sadhu, Douglas MacFadden, Marc-Danie Nazaire, William Simons, Griffin Weber, Shawn Murphy, Shyam Visweswaran
{"title":"Towards cross-application model-agnostic federated cohort discovery.","authors":"Nicholas J Dobbins, Michele Morris, Eugene Sadhu, Douglas MacFadden, Marc-Danie Nazaire, William Simons, Griffin Weber, Shawn Murphy, Shyam Visweswaran","doi":"10.1093/jamia/ocae211","DOIUrl":null,"url":null,"abstract":"<p><strong>Objectives: </strong>To demonstrate that 2 popular cohort discovery tools, Leaf and the Shared Health Research Information Network (SHRINE), are readily interoperable. Specifically, we adapted Leaf to interoperate and function as a node in a federated data network that uses SHRINE and dynamically generate queries for heterogeneous data models.</p><p><strong>Materials and methods: </strong>SHRINE queries are designed to run on the Informatics for Integrating Biology & the Bedside (i2b2) data model. We created functionality in Leaf to interoperate with a SHRINE data network and dynamically translate SHRINE queries to other data models. We randomly selected 500 past queries from the SHRINE-based national Evolve to Next-Gen Accrual to Clinical Trials (ENACT) network for evaluation, and an additional 100 queries to refine and debug Leaf's translation functionality. We created a script for Leaf to convert the terms in the SHRINE queries into equivalent structured query language (SQL) concepts, which were then executed on 2 other data models.</p><p><strong>Results and discussion: </strong>91.1% of the generated queries for non-i2b2 models returned counts within 5% (or ±5 patients for counts under 100) of i2b2, with 91.3% recall. Of the 8.9% of queries that exceeded the 5% margin, 77 of 89 (86.5%) were due to errors introduced by the Python script or the extract-transform-load process, which are easily fixed in a production deployment. The remaining errors were due to Leaf's translation function, which was later fixed.</p><p><strong>Conclusion: </strong>Our results support that cohort discovery applications such as Leaf and SHRINE can interoperate in federated data networks with heterogeneous data models.</p>","PeriodicalId":50016,"journal":{"name":"Journal of the American Medical Informatics Association","volume":" ","pages":"2202-2209"},"PeriodicalIF":4.7000,"publicationDate":"2024-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11413448/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of the American Medical Informatics Association","FirstCategoryId":"91","ListUrlMain":"https://doi.org/10.1093/jamia/ocae211","RegionNum":2,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0
Abstract
Objectives: To demonstrate that 2 popular cohort discovery tools, Leaf and the Shared Health Research Information Network (SHRINE), are readily interoperable. Specifically, we adapted Leaf to interoperate and function as a node in a federated data network that uses SHRINE and dynamically generate queries for heterogeneous data models.
Materials and methods: SHRINE queries are designed to run on the Informatics for Integrating Biology & the Bedside (i2b2) data model. We created functionality in Leaf to interoperate with a SHRINE data network and dynamically translate SHRINE queries to other data models. We randomly selected 500 past queries from the SHRINE-based national Evolve to Next-Gen Accrual to Clinical Trials (ENACT) network for evaluation, and an additional 100 queries to refine and debug Leaf's translation functionality. We created a script for Leaf to convert the terms in the SHRINE queries into equivalent structured query language (SQL) concepts, which were then executed on 2 other data models.
Results and discussion: 91.1% of the generated queries for non-i2b2 models returned counts within 5% (or ±5 patients for counts under 100) of i2b2, with 91.3% recall. Of the 8.9% of queries that exceeded the 5% margin, 77 of 89 (86.5%) were due to errors introduced by the Python script or the extract-transform-load process, which are easily fixed in a production deployment. The remaining errors were due to Leaf's translation function, which was later fixed.
Conclusion: Our results support that cohort discovery applications such as Leaf and SHRINE can interoperate in federated data networks with heterogeneous data models.
期刊介绍:
JAMIA is AMIA''s premier peer-reviewed journal for biomedical and health informatics. Covering the full spectrum of activities in the field, JAMIA includes informatics articles in the areas of clinical care, clinical research, translational science, implementation science, imaging, education, consumer health, public health, and policy. JAMIA''s articles describe innovative informatics research and systems that help to advance biomedical science and to promote health. Case reports, perspectives and reviews also help readers stay connected with the most important informatics developments in implementation, policy and education.