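Standardizing Survey Data Collection to Enhance Reproducibility: Development and Comparative Evaluation of the ReproSchema Ecosystem.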

IF 5.8 | Zone 2, Medicine | Q1, Health Care Sciences & Services

Journal of Medical Internet Research | Pub Date: 2025-07-11 | DOI: 10.2196/63343

Yibei Chen, Dorota Jarecka, Sanu Ann Abraham, Remi Gau, Evan Ng, Daniel M Low, Isaac Bevers, Alistair Johnson, Anisha Keshavan, Arno Klein, Jon Clucas, Zaliqa Rosli, Steven M Hodge, Janosch Linkersdörfer, Hauke Bartsch, Samir Das, Damien Fair, David Kennedy, Satrajit S Ghosh


Citations: 0

Abstract


Standardizing Survey Data Collection to Enhance Reproducibility: Development and Comparative Evaluation of the ReproSchema Ecosystem.

Background: Inconsistencies in survey-based (eg, questionnaire) data collection across biomedical, clinical, behavioral, and social sciences pose challenges to research reproducibility. ReproSchema is an ecosystem that standardizes survey design and facilitates reproducible data collection through a schema-centric framework, a library of reusable assessments, and computational tools for validation and conversion. Unlike conventional survey platforms that primarily offer graphical user interface-based survey creation, ReproSchema provides a structured, modular approach for defining and managing survey components, enabling interoperability and adaptability across diverse research settings.
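To make the idea of a structured, modular survey definition concrete, the following is a minimal sketch in Python. It is not ReproSchema's actual schema or API: the class names (`Item`, `Activity`, `Protocol`), their fields, and the example questions are all assumptions invented for illustration. It only shows how defining survey components once, as data, lets the same assessment be reused unchanged across studies or visits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One question: an identifier, the prompt text, and the allowed answers."""
    item_id: str
    prompt: str
    choices: tuple[str, ...]


@dataclass(frozen=True)
class Activity:
    """A reusable assessment: an ordered group of items."""
    activity_id: str
    items: tuple[Item, ...]


@dataclass(frozen=True)
class Protocol:
    """A study-level survey assembled from reusable activities."""
    protocol_id: str
    activities: tuple[Activity, ...]

    def item_ids(self) -> list[str]:
        """Return every item identifier in presentation order."""
        return [item.item_id for act in self.activities for item in act.items]


# Hypothetical example content, invented for illustration only.
mood = Activity(
    "mood_check",
    (
        Item("mood_1", "How often did you feel down this week?",
             ("never", "sometimes", "often")),
        Item("mood_2", "How often did you feel rested this week?",
             ("never", "sometimes", "often")),
    ),
)

# Two protocols share the very same activity object, so wording cannot drift.
baseline = Protocol("baseline_visit", (mood,))
follow_up = Protocol("follow_up_visit", (mood,))

print(baseline.item_ids())
```

Because the components are immutable values rather than entries typed into a graphical form builder, two protocols that include `mood` are guaranteed to present identical items.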

Objective: This study examines ReproSchema's role in enhancing research reproducibility and reliability. We introduce its conceptual and practical foundations, compare it against 12 platforms to assess its effectiveness in addressing inconsistencies in data collection, and demonstrate its application through 3 use cases: standardizing required mental health survey common data elements, tracking changes in longitudinal data collection, and creating interactive checklists for neuroimaging research.

Methods: We describe ReproSchema's core components, including its schema-based design; reusable assessment library with >90 assessments; and tools to validate data, convert survey formats (eg, REDCap [Research Electronic Data Capture] and Fast Healthcare Interoperability Resources), and build protocols. We compared 12 platforms (Center for Expanded Data Annotation and Retrieval, formr, KoboToolbox, Longitudinal Online Research and Imaging System, MindLogger, OpenClinica, Pavlovia, PsyToolkit, Qualtrics, REDCap, SurveyCTO, and SurveyMonkey) against 14 findability, accessibility, interoperability, and reusability (FAIR) principles and assessed their support of 8 survey functionalities (eg, multilingual support and automated scoring). Finally, we applied ReproSchema to 3 use cases (NIMH-Minimal, the Adolescent Brain Cognitive Development and HEALthy Brain and Child Development Studies, and the Committee on Best Practices in Data Analysis and Sharing Checklist) to illustrate ReproSchema's versatility.
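As an illustration of what validating data against a schema can look like, here is a small Python sketch. It does not use or reproduce ReproSchema's own validation tools; the item dictionary layout (`id`, `required`, `choices`, `min`, `max`) and the function names are assumptions made up for this example.

```python
def validate_response(item, value):
    """Return a list of problems for one response; an empty list means valid."""
    problems = []
    if value is None:
        # A missing answer is only a problem when the item is marked required.
        if item.get("required", False):
            problems.append(f"{item['id']}: response is required")
        return problems
    if "choices" in item and value not in item["choices"]:
        problems.append(f"{item['id']}: {value!r} is not an allowed choice")
    if "min" in item or "max" in item:
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number:
            problems.append(f"{item['id']}: expected a number")
        else:
            if "min" in item and value < item["min"]:
                problems.append(f"{item['id']}: {value} is below the minimum {item['min']}")
            if "max" in item and value > item["max"]:
                problems.append(f"{item['id']}: {value} is above the maximum {item['max']}")
    return problems


def validate_submission(items, responses):
    """Check every item in the schema against a dict of responses."""
    problems = []
    for item in items:
        problems.extend(validate_response(item, responses.get(item["id"])))
    return problems


# Hypothetical schema and submission, invented for illustration only.
schema_items = [
    {"id": "age", "required": True, "min": 0, "max": 120},
    {"id": "mood", "required": True, "choices": ["never", "sometimes", "often"]},
    {"id": "comment", "required": False},
]
submission = {"age": 130, "mood": "sometimes"}

print(validate_submission(schema_items, submission))
```

Keeping the constraints in the item definition, rather than in platform-specific form settings, means the same checks can be run wherever the data are collected.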

Results: ReproSchema provides a structured framework for standardizing survey-based data collection while ensuring compatibility with existing survey tools. Our comparison showed that ReproSchema met 14 of 14 FAIR criteria and supported 6 of 8 key survey functionalities: provision of standardized assessments, multilingual support, multimedia integration, data validation, advanced branching logic, and automated scoring. Three use cases illustrate ReproSchema's flexibility: standardizing essential mental health assessments (NIMH-Minimal), systematically tracking changes in longitudinal studies (Adolescent Brain Cognitive Development and HEALthy Brain and Child Development), and converting a 71-page neuroimaging best practices guide into an interactive checklist (Committee on Best Practices in Data Analysis and Sharing).
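The longitudinal use case rests on being able to say exactly what changed between two versions of an assessment. The sketch below shows one simple way to do that in Python when each version is stored as structured data. It is an assumption-laden illustration, not the method used in the study: the function name `diff_versions`, the mapping from item identifier to prompt text, and the example wording are all hypothetical.

```python
def diff_versions(old, new):
    """Compare two versions of an assessment.

    Each version maps an item identifier to its prompt text. The result
    lists identifiers that were added, removed, or whose wording changed.
    """
    old_ids, new_ids = set(old), set(new)
    return {
        "added": sorted(new_ids - old_ids),
        "removed": sorted(old_ids - new_ids),
        "changed": sorted(i for i in old_ids & new_ids if old[i] != new[i]),
    }


# Hypothetical versions of one questionnaire, invented for illustration only.
wave_1 = {
    "sleep_1": "How many hours did you sleep last night?",
    "sleep_2": "Did you wake up during the night?",
}
wave_2 = {
    "sleep_1": "How many hours did you sleep last night?",
    "sleep_2": "How many times did you wake up during the night?",
    "sleep_3": "Did you nap during the day?",
}

print(diff_versions(wave_1, wave_2))
```

When survey definitions are kept under version control as text, a comparison like this can be run between any two releases to document wording drift across data collection waves.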

Conclusions: ReproSchema enhances reproducibility by organizing survey-based data collection through a schema-driven approach. It integrates version control, manages metadata, and ensures interoperability, maintaining consistency across studies and compatibility with common survey tools. Planned developments, including ontology mappings and semantic search, will broaden its use, supporting transparent, scalable, and reproducible research across disciplines.


Source journal

Journal of Medical Internet Research (Medicine: Health Care)

CiteScore

14.40

Self-citation rate

5.40%

Articles published

654

Review time

1 month

About the journal: The Journal of Medical Internet Research (JMIR), founded in 1999, is a journal in health informatics and health services. It focuses on digital health, data science, health informatics, and emerging technologies for health, medicine, and biomedical research. It ranks in the first quartile (Q1) by Impact Factor and is ranked #1 on Google Scholar within the "Medical Informatics" discipline.