COVID-19 real world data infrastructure: A big data resource for study of the impact of COVID-19 in patient populations with immunocompromising conditions
James M Crawford, Lynne Penberthy, Ligia A Pinto, Keri N Althoff, Magdalene M Assimon, Oren Cohen, Laura Gillim, Tracy L Hammonds, Shilpa Kapur, Harvey W Kaufman, David Kwasny, Jean W Liew, William A Meyer, Shannon L Reynolds, Cheryl B Schleicher, Suki Subbiah, Catherine Theruviparampil, Zachary S Wallace, Jeremy L Warner, Nicole Yoon, Yonah C Ziemba
medRxiv - Infectious Diseases. Published 2024-09-09. DOI: 10.1101/2024.09.08.24313270 (https://doi.org/10.1101/2024.09.08.24313270)
Abstract
Background: We created a United States-based real-world data resource to better understand the continued impact of the COVID-19 pandemic on immunocompromised patients, who are typically under-represented in prospective studies and clinical trials.
Methods: The COVID-19 Real World Data infrastructure (CRWDi) was created by linking and harmonizing deidentified HealthVerity medical and pharmacy claims data from December 1, 2018 to December 31, 2023 with three further sources: SARS-CoV-2 virologic and serologic laboratory data from major commercial laboratories and Northwell Health; COVID-19 vaccination data; and, for patients with cancer, 2010 to 2021 National Cancer Institute Surveillance, Epidemiology, and End Results registry data.
Results: The CRWDi dataset contains data on 5.2 million people. Four populations were included in the dataset: (1) patients with cancer (n=1,294,022); (2) patients with rheumatic conditions receiving pharmacotherapy (n=1,636,940); (3) non-cancer solid organ (n=249,797) and hematopoietic stem cell (n=30,172) transplant recipients; and (4) people from the general population, including adults (>18 years of age; n=1,790,162) and pediatric patients (<18 years of age; n=198,907).
Conclusions: We have created a complex real-world data system to address unanswered questions that have arisen during the COVID-19 pandemic. Further, because the data are broadly and freely available to academic researchers from the United States, the CRWDi real-world data system is an important complement to the consortia studies and clinical trials that have emerged during the healthcare crisis, and it can be readily reproduced for future purposes.
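The population counts reported in the Results can be tallied as a quick consistency check against the stated total of 5.2 million people. The sketch below is illustrative only: the dictionary keys are hypothetical labels rather than CRWDi field names, and it assumes the six reported groups do not overlap, which the abstract does not state explicitly.

```python
# Cohort sizes as reported in the abstract. Keys are illustrative labels,
# not names taken from the CRWDi data dictionary.
cohorts = {
    "cancer": 1_294_022,
    "rheumatic_on_pharmacotherapy": 1_636_940,
    "solid_organ_transplant_non_cancer": 249_797,
    "hematopoietic_stem_cell_transplant": 30_172,
    "general_population_adult": 1_790_162,
    "general_population_pediatric": 198_907,
}

# Assumption: the groups are mutually exclusive, so a plain sum is meaningful.
total = sum(cohorts.values())

# Share of the overall dataset contributed by each group.
shares = {name: n / total for name, n in cohorts.items()}

print(f"total: {total:,}")  # total: 5,200,000
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

Under that assumption, the six counts sum to the 5.2 million people reported for the dataset.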