The Complete Inpatient Record Using Comprehensive Electronic Data (CIRCE) project: A team-based approach to clinically validated, research-ready electronic health record data
Andrea L. C. Schneider, Jennifer C. Ginestra, Meeta Prasad Kerlin, Michael G. S. Shashaty, Todd A. Miano, Daniel S. Herman, Oscar J. L. Mitchell, Rachel Bennett, Alexander T. Moffett, John Chandler, Atul Kalanuria, Zahra Faraji, Nicholas S. Bishop, Benjamin Schmid, Angela T. Chen, Kathryn H. Bowles, Thomas Joseph, Rachel Kohn, Rachel R. Kelz, George L. Anesi, Monisha Kumar, Ari B. Friedman, Emily Vail, Nuala J. Meyer, Blanca E. Himes, Gary E. Weissman
{"title":"The Complete Inpatient Record Using Comprehensive Electronic Data (CIRCE) project: A team-based approach to clinically validated, research-ready electronic health record data","authors":"Andrea L. C. Schneider, Jennifer C. Ginestra, Meeta Prasad Kerlin, Michael G. S. Shashaty, Todd A. Miano, Daniel S. Herman, Oscar J. L. Mitchell, Rachel Bennett, Alexander T. Moffett, John Chandler, Atul Kalanuria, Zahra Faraji, Nicholas S. Bishop, Benjamin Schmid, Angela T. Chen, Kathryn H. Bowles, Thomas Joseph, Rachel Kohn, Rachel R. Kelz, George L. Anesi, Monisha Kumar, Ari B. Friedman, Emily Vail, Nuala J. Meyer, Blanca E. Himes, Gary E. Weissman","doi":"10.1002/lrh2.10439","DOIUrl":null,"url":null,"abstract":"<div>\n \n \n <section>\n \n <h3> Introduction</h3>\n \n <p>The rapid adoption of electronic health record (EHR) systems has resulted in extensive archives of data relevant to clinical research, hospital operations, and the development of learning health systems. However, EHR data are not frequently available, cleaned, standardized, validated, and ready for use by stakeholders. We describe an in-progress effort to overcome these challenges with cooperative, systematic data extraction and validation.</p>\n </section>\n \n <section>\n \n <h3> Methods</h3>\n \n <p>A multi-disciplinary team of investigators collaborated to create the Complete Inpatient Record Using Comprehensive Electronic Data (CIRCE) Project dataset, which captures EHR data from six hospitals within the University of Pennsylvania Health System. Analysts and clinical researchers jointly iteratively reviewed SQL queries and their output to validate desired data elements. Data from patients aged ≥18 years with at least one encounter at an acute care hospital or hospice occurring since 7/1/2017 were included. 
The CIRCE Project includes three layers: (1) raw data comprised of direct SQL query output, (2) cleaned data with errors removed, and (3) transformed data with standardized implementations of commonly used case definitions and clinical scores.</p>\n </section>\n \n <section>\n \n <h3> Results</h3>\n \n <p>Between July 1, 2017 and December 31, 2023, the dataset captured 1 629 920 encounters from 740 035 patients. Most encounters were emergency department only visits (<i>n</i> = 965 834, 59.3%), followed by inpatient admissions without an intensive care unit admission (<i>n</i> = 518 367, 23.7%). The median age was 46.9 years (25th–75th percentiles = 31.1–64.7) at the time of the first encounter. Most patients were female (<i>n</i> = 418 303, 56.5%), a significant proportion were of non-White race (<i>n</i> = 272 018, 36.8%), and 54 625 (7.4%) were of Hispanic/Latino ethnicity.</p>\n </section>\n \n <section>\n \n <h3> Conclusions</h3>\n \n <p>The CIRCE Project represents a novel cooperative research model to capture clinically validated EHR data from a large diverse academic health system in the greater Philadelphia region and is designed to facilitate collaboration and data sharing to support learning health system activities. 
Ultimately, these data will be de-identified and converted to a publicly available resource.</p>\n </section>\n </div>","PeriodicalId":43916,"journal":{"name":"Learning Health Systems","volume":"9 1","pages":""},"PeriodicalIF":2.6000,"publicationDate":"2024-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11733450/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Learning Health Systems","FirstCategoryId":"1085","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/lrh2.10439","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"HEALTH POLICY & SERVICES","Score":null,"Total":0}
Abstract
Introduction
The rapid adoption of electronic health record (EHR) systems has produced extensive archives of data relevant to clinical research, hospital operations, and the development of learning health systems. However, EHR data are seldom available to stakeholders in a cleaned, standardized, validated, ready-to-use form. We describe an in-progress effort to overcome these challenges through cooperative, systematic data extraction and validation.
Methods
A multi-disciplinary team of investigators collaborated to create the Complete Inpatient Record Using Comprehensive Electronic Data (CIRCE) Project dataset, which captures EHR data from six hospitals within the University of Pennsylvania Health System. Analysts and clinical researchers jointly and iteratively reviewed SQL queries and their output to validate the desired data elements. Data from patients aged ≥18 years with at least one encounter at an acute care hospital or hospice on or after July 1, 2017 were included. The CIRCE Project includes three layers: (1) raw data consisting of direct SQL query output, (2) cleaned data with errors removed, and (3) transformed data with standardized implementations of commonly used case definitions and clinical scores.
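The inclusion criteria and the three-layer structure described above can be illustrated with a small sketch. This is not the project's implementation: the `Encounter` record, its field names, the plausibility rule in `clean`, and the `age_group` label in `transform` are all hypothetical stand-ins chosen for illustration. Only the age threshold (≥18 years) and the start date (July 1, 2017) come from the text.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

STUDY_START = date(2017, 7, 1)  # from the Methods text
MIN_AGE_YEARS = 18              # from the Methods text


@dataclass
class Encounter:
    """Hypothetical raw-layer row; field names are assumptions."""
    patient_id: str
    encounter_date: date
    age_years: Optional[float]  # None models a missing value in raw output


def is_eligible(e: Encounter) -> bool:
    """Inclusion criteria: adult patient, encounter on or after the start date."""
    return (
        e.age_years is not None
        and e.age_years >= MIN_AGE_YEARS
        and e.encounter_date >= STUDY_START
    )


def clean(raw: list[Encounter]) -> list[Encounter]:
    """Layer 2 (illustrative rule only): drop rows with missing or implausible age."""
    return [e for e in raw if e.age_years is not None and 0 <= e.age_years <= 120]


def transform(cleaned: list[Encounter]) -> list[dict]:
    """Layer 3 (placeholder): attach a standardized derived label to each row."""
    return [
        {
            "patient_id": e.patient_id,
            # Hypothetical derived field standing in for a case definition or score.
            "age_group": "65+" if e.age_years >= 65 else "18-64",
        }
        for e in cleaned
    ]


# Layer 1: rows as they might come back from a SQL query, restricted to eligible ones.
query_output = [
    Encounter("p1", date(2018, 3, 2), 46.9),
    Encounter("p2", date(2016, 12, 31), 70.0),  # before the start date
    Encounter("p3", date(2020, 5, 5), 17.5),    # under 18
    Encounter("p4", date(2021, 1, 1), None),    # missing age
    Encounter("p5", date(2017, 7, 1), 18.0),    # boundary case, included
    Encounter("p6", date(2019, 9, 9), 230.0),   # implausible age, removed in layer 2
]
raw_layer = [e for e in query_output if is_eligible(e)]
cleaned_layer = clean(raw_layer)
transformed_layer = transform(cleaned_layer)
```

Keeping each layer as a separate step means a later change to a cleaning rule or derived definition does not require re-running the extraction.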
Results
Between July 1, 2017 and December 31, 2023, the dataset captured 1 629 920 encounters from 740 035 patients. Most encounters were emergency department-only visits (n = 965 834, 59.3%), followed by inpatient admissions without an intensive care unit admission (n = 518 367, 31.8%). The median age was 46.9 years (25th–75th percentiles = 31.1–64.7) at the time of the first encounter. Most patients were female (n = 418 303, 56.5%), a substantial proportion were of non-White race (n = 272 018, 36.8%), and 54 625 (7.4%) were of Hispanic/Latino ethnicity.
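As a consistency check, the reported proportions can be recomputed from the counts above. The sketch below assumes that encounter-level percentages use the total number of encounters as the denominator and that patient-level percentages use the total number of patients; the dictionary keys are illustrative labels, not variable names from the dataset.

```python
def pct(count: int, total: int) -> float:
    """Percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


TOTAL_ENCOUNTERS = 1_629_920
TOTAL_PATIENTS = 740_035

shares = {
    # Encounter-level proportions
    "ed_only_encounters": pct(965_834, TOTAL_ENCOUNTERS),
    "inpatient_without_icu": pct(518_367, TOTAL_ENCOUNTERS),
    # Patient-level proportions
    "female": pct(418_303, TOTAL_PATIENTS),
    "non_white": pct(272_018, TOTAL_PATIENTS),
    "hispanic_latino": pct(54_625, TOTAL_PATIENTS),
}

for label, value in shares.items():
    print(f"{label}: {value}%")
```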
Conclusions
The CIRCE Project represents a novel cooperative research model for capturing clinically validated EHR data from a large, diverse academic health system in the greater Philadelphia region. It is designed to facilitate collaboration and data sharing in support of learning health system activities. Ultimately, these data will be de-identified and converted to a publicly available resource.