CohortCharacteristics: an R package for population characterisation in observational studies using the OMOP common data model
Mike Du, Albert Prats-Uribe, Núria Mercadé-Besora, Kim Lopez-Guell, Yuchen Guo, Marta Alcalde-Herraiz, Xihang Chen, Antonella Delmestri, Wai Yi Man, Talita Duarte-Salles, Anna Palomar, Agustina Giuliodori, Emanuel Brađašević, Antea Jezidžić, Elvira Bräuner, Susanne Bruun, Katia Verhamme, Mees Mosseveld, James T Brash, Dina Vojinovic, Isabella Kaczmarczyk, Akram Mendez, Peter Rijnbeek, Daniel Prieto-Alhambra, Edward Burn, Martí Català
European Journal of Epidemiology. Journal Article. Publication date: 2026-04-03. DOI: https://doi.org/10.1007/s10654-025-01352-4
Abstract
Standardised cohort characterisation helps ensure comparability and reproducibility in multi-database observational studies. To address this need, we developed CohortCharacteristics, an open-source R package that facilitates standardised cohort characterisation in datasets mapped to the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) and can be used with various types of databases. This study explains the development of the package and demonstrates its core functionality. To demonstrate that functionality, we used CohortCharacteristics to generate descriptive statistics on demographics, comorbidities, medication exposures, cohort overlap, and timing of cohort entries. The study included data from CPRD GOLD (UK), DK-DHR (Denmark), IPCI (Netherlands), IQVIA Longitudinal Patient Database Belgium (IQVIA LPD Belgium), IQVIA DA Germany, NAJS (Croatia), and SIDIAP (Spain), all mapped to the OMOP CDM. The CohortCharacteristics R package is freely available on CRAN, with detailed vignettes and documentation of its functionality. Cohort characteristics were generally consistent across databases, with similar age distributions and proportions of female participants. CPRD GOLD, NAJS, and SIDIAP showed higher prescribing rates for respiratory, cardiovascular, and nervous system medications, while the IQVIA databases and DK-DHR showed lower rates. Timing analysis showed that dementia diagnoses typically followed insomnia diagnoses in several databases, consistent with existing literature. Antipsychotic prescriptions often occurred after dementia diagnosis, reflecting prescribing practices aligned with clinical guidelines. CohortCharacteristics enables consistent cohort characterisation across a network of databases mapped to the OMOP CDM, thereby improving transparency in multi-database research. The functionality demonstrated in this study illustrates the package's applicability to observational studies using OMOP CDM data.
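To make the terms "cohort overlap" and "timing of cohort entries" concrete, the following is a minimal sketch in plain Python. It is not the CohortCharacteristics API, and it is not the authors' implementation. The toy cohorts, the function names `cohort_overlap` and `cohort_timing`, and the dictionary layout (subject ID mapped to cohort entry date) are all assumptions made for illustration, and the dates are invented rather than taken from any study database.

```python
from datetime import date
from statistics import median

# Hypothetical toy cohorts (invented data): subject_id -> cohort entry date.
insomnia = {1: date(2015, 3, 1), 2: date(2016, 7, 15), 3: date(2018, 1, 10)}
dementia = {2: date(2019, 7, 15), 3: date(2018, 1, 20), 4: date(2020, 5, 5)}


def cohort_overlap(a, b):
    """Count subjects found only in cohort a, in both cohorts, and only in cohort b."""
    ids_a, ids_b = set(a), set(b)
    return {
        "only_a": len(ids_a - ids_b),
        "both": len(ids_a & ids_b),
        "only_b": len(ids_b - ids_a),
    }


def cohort_timing(a, b):
    """Days from entry into cohort a to entry into cohort b, for subjects in both.

    A positive value means the subject entered cohort b after cohort a.
    """
    shared = sorted(set(a) & set(b))
    return [(b[subject] - a[subject]).days for subject in shared]


overlap = cohort_overlap(insomnia, dementia)
days_between = cohort_timing(insomnia, dementia)
median_days = median(days_between)

print(overlap)
print(days_between, median_days)
```

In this toy example, a positive median number of days would correspond to the pattern described above, in which entry into the second cohort tends to follow entry into the first. A real analysis on OMOP CDM data would operate on cohort tables in a database rather than in-memory dictionaries.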
About the journal:
The European Journal of Epidemiology, established in 1985, is a peer-reviewed publication that provides a platform for discussions on epidemiology in its broadest sense. It covers various aspects of epidemiologic research and statistical methods. The journal facilitates communication between researchers, educators, and practitioners in epidemiology, including those in clinical and community medicine. Contributions from diverse fields such as public health, preventive medicine, clinical medicine, health economics, and computational biology and data science, in relation to health and disease, are encouraged. While accepting submissions from all over the world, the journal particularly emphasizes European topics relevant to epidemiology. The published articles consist of empirical research findings, developments in methodology, and opinion pieces.