Pseudonymization tools for medical research: a systematic review.
Hammam Abu Attieh, Armin Müller, Felix Nikolaus Wirth, Fabian Prasser
BMC Medical Informatics and Decision Making 25(1):128, published 2025-03-12 (Journal Article)
DOI: https://doi.org/10.1186/s12911-025-02958-0
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11905493/pdf/
Citations: 0
Abstract
Background: Pseudonymization is an important technique for the secure and compliant use of medical data in research. At its core, pseudonymization is a process in which directly identifying information is separated from medical research data. Due to its importance, a wide range of pseudonymization tools and services have been developed, and researchers face the challenge of selecting an appropriate tool for their research projects. This review aims to address this challenge by systematically comparing existing tools.
Methods: A systematic review was performed and is reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines where applicable. The search covered PubMed and Web of Science to identify pseudonymization tools documented in the scientific literature. The tools were assessed based on predefined criteria across four key dimensions that describe researchers' requirements: (1) single-center vs. multi-center use, (2) short-term vs. long-term projects, (3) small data vs. big data processing, and (4) integration vs. standalone functionality.
Results: From an initial pool of 1,052 papers, 92 were selected for detailed full-text review after the title and abstract screening. This led to the identification of 20 pseudonymization tools, of which 10 met our inclusion criteria and were assessed. The results show that the tools differ in ways that make them more or less suited to research projects that vary along the dimensions described above, enabling us to provide targeted recommendations.
Conclusions: The landscape of existing pseudonymization tools is heterogeneous, and researchers need to carefully select the appropriate solutions for their research projects. Our findings highlight two Software-as-a-Service-based solutions that enable centralized use without local infrastructure, one tool for retrospective pseudonymization of existing databases, two tools suitable for local deployment in smaller, short-term projects, and two tools well-suited for local deployment in large, multi-center studies.
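The Background describes pseudonymization as separating directly identifying information from medical research data. A minimal Python sketch of that idea follows. It is an illustration only and does not reproduce any of the reviewed tools: the function name `pseudonymize`, the field names (`name`, `birth_date`, `diagnosis`), and the use of random hex tokens as pseudonyms are all assumptions made for this example.

```python
import secrets


def pseudonymize(records, id_fields=("name", "birth_date")):
    """Split records into an identity table and research data.

    Hypothetical helper for illustration; field names are assumptions.
    The two outputs are linked only by a random pseudonym.
    """
    identity_table = {}  # pseudonym -> directly identifying fields
    research_data = []   # records without directly identifying fields

    for record in records:
        # Draw a random pseudonym; retry in the unlikely case of a collision.
        pseudonym = secrets.token_hex(8)
        while pseudonym in identity_table:
            pseudonym = secrets.token_hex(8)

        # Identifying fields go to the identity table only.
        identity_table[pseudonym] = {
            key: record[key] for key in id_fields if key in record
        }

        # Everything else goes to the research data set, keyed by pseudonym.
        research = {k: v for k, v in record.items() if k not in id_fields}
        research["pseudonym"] = pseudonym
        research_data.append(research)

    return identity_table, research_data


# Made-up example records, not real patient data.
patients = [
    {"name": "Jane Doe", "birth_date": "1980-01-01", "diagnosis": "I10"},
    {"name": "John Roe", "birth_date": "1975-05-05", "diagnosis": "E11"},
]
identity_table, research_data = pseudonymize(patients)
```

In this sketch the identity table would be stored separately from the research data and under stricter access control, so that researchers work only with the pseudonymized records. Production tools typically add features this example omits, such as record linkage across sites and controlled re-identification.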
About the journal:
BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.