Best practices for multimodal clinical data management and integration: An atopic dermatitis research case
Tazro Ohta, Ayaka Hananoe, Ayano Fukushima-Nomura, Koichi Ashizaki, Aiko Sekita, Jun Seita, Eiryo Kawakami, Kazuhiro Sakurada, Masayuki Amagai, Haruhiko Koseki, Hiroshi Kawasaki
Allergology International, volume 73, issue 2, pages 255-263. Published 2023-12-14. DOI: 10.1016/j.alit.2023.11.006
https://www.sciencedirect.com/science/article/pii/S1323893023001181
Abstract
Background
In clinical research on multifactorial diseases such as atopic dermatitis, data-driven medical research has become more widely used as a means to clarify diverse pathological conditions and to realize precision medicine. However, modern clinical data, characterized as large-scale, multimodal, and multi-center, is difficult to integrate and manage, which limits productivity in clinical data science.
Methods
We designed a generic data management flow to collect, cleanse, and integrate the different types of data generated at multiple institutions by 10 types of clinical studies. We developed MeDIA (Medical Data Integration Assistant), software for browsing the data in an integrated manner and extracting subsets for analysis.
Results
MeDIA integrates and visualizes data and information on research participants obtained from multiple studies. It provides a sophisticated interface that supports data management and helps data scientists retrieve the data sets they need. Furthermore, the system promotes the use of unified terms such as identifiers or sampling dates to reduce the cost of pre-processing by data analysts. We also propose best practices in clinical data management flow, which we learned from the development and implementation of MeDIA.
Conclusions
The MeDIA system solves the problem of multimodal clinical data integration, from complex text data such as medical records to big data such as omics data from a large number of patients. The system and the proposed best practices can be applied not only to allergic diseases but also to other diseases to promote data-driven medical research.
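The Results mention unified identifiers and sampling dates as a way to cut pre-processing cost. The following is a minimal sketch of that idea only, not MeDIA's actual implementation: the record fields (`participant`, `sampled`, `easi`, `il13_expr`), the canonical ID form `P0012`, and the accepted date layouts are all hypothetical choices made for illustration.

```python
from datetime import date, datetime

# Hypothetical date layouts that different study sites might use.
DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d", "%d.%m.%Y")


def normalize_id(raw: str) -> str:
    """Map site-specific spellings of a participant ID to one canonical form."""
    digits = "".join(ch for ch in raw if ch.isdigit())
    if not digits:
        raise ValueError(f"no numeric part in participant ID: {raw!r}")
    return f"P{int(digits):04d}"


def normalize_date(raw: str) -> date:
    """Parse a sampling date written in any of the assumed layouts."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized sampling date: {raw!r}")


def integrate(*sources: list[dict]) -> dict[tuple[str, date], dict]:
    """Merge records from several studies on (participant ID, sampling date)."""
    merged: dict[tuple[str, date], dict] = {}
    for records in sources:
        for rec in records:
            key = (normalize_id(rec["participant"]), normalize_date(rec["sampled"]))
            # Keep every measurement field; drop the raw, un-normalized keys.
            values = {k: v for k, v in rec.items() if k not in ("participant", "sampled")}
            merged.setdefault(key, {}).update(values)
    return merged


# Two made-up records for the same visit, written in different site conventions.
clinical = [{"participant": "p-12", "sampled": "2021/04/05", "easi": 18.5}]
omics = [{"participant": "P0012", "sampled": "2021-04-05", "il13_expr": 2.3}]

integrated = integrate(clinical, omics)
```

Once both sources share the same key, the clinical score and the omics measurement for one visit end up in a single record, so an analyst can select subsets without reconciling identifiers by hand.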
About the journal:
Allergology International is the official journal of the Japanese Society of Allergology and publishes original papers dealing with the etiology, diagnosis and treatment of allergic and related diseases. Papers may include the study of methods of controlling allergic reactions, human and animal models of hypersensitivity and other aspects of basic and applied clinical allergy in its broadest sense.
The Journal aims to encourage the international exchange of results and invites authors from all countries to submit papers in the following three categories: Original Articles, Review Articles, and Letters to the Editor.