Toward a Domain-Overarching Metadata Schema for Making Health Research Studies FAIR (Findable, Accessible, Interoperable, and Reusable): Development of the NFDI4Health Metadata Schema.
Haitham Abaza, Aliaksandra Shutsko, Sophie A I Klopfenstein, Carina N Vorisek, Carsten Oliver Schmidt, Claudia Brünings-Kuppe, Vera Clemens, Johannes Darms, Sabine Hanß, Timm Intemann, Franziska Jannasch, Elisa Kasbohm, Birte Lindstädt, Matthias Löbe, Katharina Nimptsch, Ute Nöthlings, Marisabel Gonzalez Ocanto, Tracy Bonsu Osei, Ines Perrar, Manuela Peters, Tobias Pischon, Ulrich Sax, Matthias B Schulze, Florian Schwarz, Carolina Schwedhelm, Sylvia Thun, Dagmar Waltemath, Hannes Wünsche, Atinkut A Zeleke, Wolfgang Müller, Martin Golebiewski
JMIR Medical Informatics, volume 13, article e63906. Published 2025-05-21. DOI: 10.2196/63906 (https://doi.org/10.2196/63906). Journal impact factor: 3.1 (JCR Q2, Medical Informatics).
Abstract
Background: Despite wide acceptance in medical research, implementation of the FAIR (findability, accessibility, interoperability, and reusability) principles in certain health domains and interoperability across data sources remain a challenge. While clinical trial registries collect metadata about clinical studies, numerous epidemiological and public health studies remain unregistered or lack detailed information about relevant study documents. Making valuable data from these studies available to the research community could improve our understanding of various diseases and their risk factors. The National Research Data Infrastructure for Personal Health Data (NFDI4Health) seeks to optimize data sharing among the clinical, epidemiological, and public health research communities while preserving privacy and ethical regulations.
Objective: We aimed to develop a tailored metadata schema (MDS) to support the standardized publication of health studies' metadata in NFDI4Health services and beyond. This study describes the development, structure, and implementation of this MDS designed to improve the FAIRness of metadata from clinical, epidemiological, and public health research while maintaining compatibility with metadata models of other resources to ease interoperability.
Methods: Based on the models of DataCite, ClinicalTrials.gov, and other data models and international standards, the first MDS version was developed by the NFDI4Health Task Force COVID-19. It was later extended in a modular fashion, combining generic and NFDI4Health use case-specific metadata items relevant to domains of nutritional epidemiology, chronic diseases, and record linkage. Mappings to schemas of clinical trial registries and international and local initiatives were performed to enable interfacing with external resources. The MDS is represented in Microsoft Excel spreadsheets. A transformation into an improved and interactive machine-readable format was completed using the ART-DECOR (Advanced Requirement Tooling-Data Elements, Codes, OIDs, and Rules) tool to facilitate editing, maintenance, and versioning.
Results: The MDS is implemented in NFDI4Health services (eg, the German Central Health Study Hub and the Local Data Hub) to structure and exchange study-related metadata. Its current version (3.3) comprises 220 metadata items in 5 modules. The core and design modules cover generic metadata, including bibliographic information, study design details, and data access information. Domain-specific metadata are included in use case-specific modules, currently comprising nutritional epidemiology, chronic diseases, and record linkage. All modules incorporate mandatory, optional, and conditional items. Mappings to the schemas of clinical trial registries and other resources enable integrating their study metadata in the NFDI4Health services. The current MDS version is available in both Excel and ART-DECOR formats.
Conclusions: With its implementation in the German Central Health Study Hub and the Local Data Hub, the MDS improves the FAIRness of data from clinical, epidemiological, and public health research. Due to its generic nature and interoperability through mappings to other schemas, it is transferable to services from adjacent domains, making it useful for a broader user community.
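The Results describe modules whose items are mandatory, optional, or conditional. A minimal sketch of how such obligation levels might be checked against a study record follows. The item names, the module assignments, and the condition on `study_type` are hypothetical illustrations, not items taken from the NFDI4Health MDS, and the code does not reflect how the NFDI4Health services actually validate metadata.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class Item:
    """One metadata item with its module and obligation level."""
    name: str
    module: str
    obligation: str  # "mandatory", "optional", or "conditional"
    # Only used for conditional items: decides whether the item
    # becomes required for a given record.
    required_if: Optional[Callable[[dict], bool]] = None


# Hypothetical items for illustration only; the real schema (version 3.3)
# comprises 220 items in 5 modules.
ITEMS = [
    Item("title", "core", "mandatory"),
    Item("description", "core", "optional"),
    Item("study_type", "design", "mandatory"),
    Item(
        "intervention_name",
        "design",
        "conditional",
        required_if=lambda record: record.get("study_type") == "interventional",
    ),
]


def missing_items(record: dict) -> list[str]:
    """Return the names of required items that are absent or empty."""
    missing = []
    for item in ITEMS:
        required = item.obligation == "mandatory" or (
            item.obligation == "conditional"
            and item.required_if is not None
            and item.required_if(record)
        )
        if required and not record.get(item.name):
            missing.append(item.name)
    return missing


if __name__ == "__main__":
    record = {"title": "Example cohort", "study_type": "interventional"}
    print(missing_items(record))
```

Modelling the condition as a predicate on the record keeps conditional items in the same table as mandatory and optional ones, so a new module can be added by appending items rather than by changing the checking logic.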
About the journal:
JMIR Medical Informatics (JMI, ISSN 2291-9694) is a top-rated, tier A journal that focuses on clinical informatics, big data in health and health care, decision support for health professionals, electronic health records, and eHealth infrastructures and implementation. It focuses on applied, translational research, with a broad readership including clinicians, CIOs, engineers, and industry and health informatics professionals.
Published by JMIR Publications, publisher of the Journal of Medical Internet Research (JMIR), the leading eHealth/mHealth journal (Impact Factor 2016: 5.175), JMIR Med Inform has a slightly different scope: it places more emphasis on applications for clinicians and health professionals than on consumers/citizens, who are the focus of JMIR. It also publishes even faster and accepts papers that are more technical or more formative than those published in the Journal of Medical Internet Research.