PED-DATA: A Privacy-Preserving Framework for Data-Driven, Pediatric Multi-Center Studies.

Gorkem Yilmaz, Jonathan M Mang, Markus Metzler, Hans-Ulrich Prokosch, Manfred Rauh, Jakob Zierk

Studies in Health Technology and Informatics, vol. 331, pp. 307-317. Journal article, published 2025-09-03. DOI: 10.3233/SHTI251409 (https://doi.org/10.3233/SHTI251409)

Abstract

Introduction: Data-driven analysis of clinical databases is an efficient method for generating clinical knowledge, and it is especially suitable where exceptional ethical and practical restrictions apply, as in pediatrics. In the multi-center PEDREF 2.0 study, we are analyzing children's laboratory test results, diagnoses, and procedures from more than 20 German tertiary care centers to establish pediatric reference intervals. The PEDREF 2.0 study uses the framework of the German Medical Informatics Initiative, but its specific needs required the development of a customized module for distributed pediatric analyses.

Methods: We developed the Pediatric Distributed Analysis, Anonymization, and Aggregation Module (PED-DATA), a containerized application that we deployed to all participating centers. PED-DATA transforms the input datasets into a harmonized internal representation and enables their decentralized analysis in compliance with data protection rules. The result is an anonymous output dataset that is transferred for central analysis.
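
The abstract does not describe PED-DATA's internals, so the following is only a minimal Python sketch of the general pattern named above: harmonize site-specific records, then release nothing but aggregated counts. The code map, the unit factors, the binning, and the suppression threshold of 5 are all hypothetical assumptions for illustration and are not taken from the paper.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical mapping from site-specific test codes to a shared analyte name.
LOCAL_CODE_MAP = {"HB": "hemoglobin", "Hb_g_dl": "hemoglobin"}

# Hypothetical conversion factors into one common unit (g/dL for hemoglobin).
UNIT_FACTORS = {("hemoglobin", "g/dL"): 1.0, ("hemoglobin", "g/L"): 0.1}

# Assumed small-cell suppression threshold; not a value stated in the paper.
MIN_CELL_COUNT = 5


@dataclass(frozen=True)
class Result:
    """Harmonized record; it carries no patient identifier."""
    analyte: str
    age_years: int
    value: float


def harmonize(raw):
    """Map one site-specific record (a dict) to the shared representation."""
    analyte = LOCAL_CODE_MAP[raw["code"]]
    factor = UNIT_FACTORS[(analyte, raw["unit"])]
    return Result(analyte, int(raw["age_days"] // 365), raw["value"] * factor)


def aggregate(results, bin_width=1.0):
    """Count results per (analyte, age, value bin) and drop small cells."""
    counts = Counter(
        (r.analyte, r.age_years, int(r.value // bin_width)) for r in results
    )
    return {cell: n for cell, n in counts.items() if n >= MIN_CELL_COUNT}
```

In this sketch only the dictionary returned by `aggregate` would leave a center. Whether binned counts with small-cell suppression are sufficient for anonymity depends on the applicable data protection rules and is not settled by the code.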

Results: In a preliminary analysis of data from 15 centers, we analyzed 52,807,236 laboratory test results from 753,774 different patients (323,943 to 4,338,317 test results per laboratory test), enabling us to establish pediatric reference intervals with previously unmatched precision.
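
A reference interval is conventionally reported as the central 95% of a reference distribution, that is, the 2.5th to 97.5th percentiles. The Python sketch below shows how such percentiles could be read from histograms pooled across centers. It is an illustration under assumptions, not the study's estimation method: the function names and the histogram format are hypothetical, and routine clinical data also contain pathological results that a naive percentile calculation does not separate out.

```python
from collections import Counter


def pool(site_histograms):
    """Sum the per-bin counts reported by each site into one histogram."""
    pooled = Counter()
    for hist in site_histograms:
        pooled.update(hist)
    return pooled


def percentile_bin(hist, q):
    """Return the lowest bin at which the cumulative share reaches q."""
    total = sum(hist.values())
    running = 0
    for bin_index in sorted(hist):
        running += hist[bin_index]
        if running >= q * total:
            return bin_index
    raise ValueError("empty histogram")


def central_interval(hist, bin_width=1.0):
    """Lower edge of the 2.5% bin and upper edge of the 97.5% bin."""
    low = percentile_bin(hist, 0.025) * bin_width
    high = (percentile_bin(hist, 0.975) + 1) * bin_width
    return low, high
```

The resolution of the interval limits is bounded by the bin width, so narrower bins give finer limits at the cost of smaller counts per cell.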

Conclusion: PED-DATA facilitates the implementation of pediatric data-driven multi-center studies in a decentralized and privacy-respecting manner, and its use across German university hospitals in the PEDREF 2.0 study demonstrates its usefulness in a real-world use case.
