Samir Gupta, Lin Liu, Olga V Patterson, Ashley Earles, Ranier Bustamante, Andrew J Gawron, William K Thompson, William Scuba, Daniel Denhalter, M Elena Martinez, Karen Messer, Deborah A Fisher, Sameer D Saini, Scott L DuVall, Wendy W Chapman, Mary A Whooley, Tonya Kaltenbach
{"title":"A Framework for Leveraging \"Big Data\" to Advance Epidemiology and Improve Quality: Design of the VA Colonoscopy Collaborative.","authors":"Samir Gupta, Lin Liu, Olga V Patterson, Ashley Earles, Ranier Bustamante, Andrew J Gawron, William K Thompson, William Scuba, Daniel Denhalter, M Elena Martinez, Karen Messer, Deborah A Fisher, Sameer D Saini, Scott L DuVall, Wendy W Chapman, Mary A Whooley, Tonya Kaltenbach","doi":"10.5334/egems.198","DOIUrl":"10.5334/egems.198","url":null,"abstract":"<p><strong>Objective: </strong>To describe a framework for leveraging big data for research and quality improvement purposes and demonstrate implementation of the framework for design of the Department of Veterans Affairs (VA) Colonoscopy Collaborative.</p><p><strong>Methods: </strong>We propose that research utilizing large-scale electronic health records (EHRs) can be approached in a 4 step framework: 1) Identify data sources required to answer research question; 2) Determine whether variables are available as structured or free-text data; 3) Utilize a rigorous approach to refine variables and assess data quality; 4) Create the analytic dataset and perform analyses. We describe implementation of the framework as part of the VA Colonoscopy Collaborative, which aims to leverage big data to 1) prospectively measure and report colonoscopy quality and 2) develop and validate a risk prediction model for colorectal cancer (CRC) and high-risk polyps.</p><p><strong>Results: </strong>Examples of implementation of the 4 step framework are provided. To date, we have identified 2,337,171 Veterans who have undergone colonoscopy between 1999 and 2014. Median age was 62 years, and 4.6 percent (n = 106,860) were female. We estimated that 2.6 percent (n = 60,517) had CRC diagnosed at baseline. 
An additional 1 percent (n = 24,483) had a new ICD-9 code-based diagnosis of CRC on follow up.</p><p><strong>Conclusion: </strong>We hope our framework may contribute to the dialogue on best practices to ensure high quality epidemiologic and quality improvement work. As a result of implementation of the framework, the VA Colonoscopy Collaborative holds great promise for 1) quantifying and providing novel understandings of colonoscopy outcomes, and 2) building a robust approach for nationwide VA colonoscopy quality reporting.</p>","PeriodicalId":72880,"journal":{"name":"EGEMS (Washington, DC)","volume":" ","pages":"4"},"PeriodicalIF":0.0,"publicationDate":"2018-04-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5983017/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"36205113","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Laura Goettinger Qualls, Thomas A Phillips, Bradley G Hammill, James Topping, Darcy M Louzao, Jeffrey S Brown, Lesley H Curtis, Keith Marsolo
{"title":"Evaluating Foundational Data Quality in the National Patient-Centered Clinical Research Network (PCORnet®).","authors":"Laura Goettinger Qualls, Thomas A Phillips, Bradley G Hammill, James Topping, Darcy M Louzao, Jeffrey S Brown, Lesley H Curtis, Keith Marsolo","doi":"10.5334/egems.199","DOIUrl":"https://doi.org/10.5334/egems.199","url":null,"abstract":"<p><strong>Introduction: </strong>Distributed research networks (DRNs) are critical components of the strategic roadmaps for the National Institutes of Health and the Food and Drug Administration as they work to move toward large-scale systems of evidence generation. The National Patient-Centered Clinical Research Network (PCORnet®) is one of the first DRNs to incorporate electronic health record data from multiple domains on a national scale. Before conducting analyses in a DRN, it is important to assess the quality and characteristics of the data.</p><p><strong>Methods: </strong>PCORnet's Coordinating Center is responsible for evaluating foundational data quality, or assessing fitness-for-use across a broad research portfolio, through a process called data curation. Data curation involves a set of analytic and querying activities to assess data quality coupled with maintenance of detailed documentation and ongoing communication with network partners. The first cycle of PCORnet data curation focused on six domains in the PCORnet common data model: demographics, diagnoses, encounters, enrollment, procedures, and vitals.</p><p><strong>Results: </strong>The data curation process led to improvements in foundational data quality. Notable improvements included the elimination of data model conformance errors; a decrease in implausible height, weight, and blood pressure values; an increase in the volume of diagnoses and procedures; and more complete data for key analytic variables. 
Based on the findings of the first cycle, we made modifications to the curation process to increase efficiencies and further reduce variation among data partners.</p><p><strong>Discussion: </strong>The iterative nature of the data curation process allows PCORnet to gradually increase the foundational level of data quality and reduce variability across the network. These activities help increase the transparency and reproducibility of analyses within PCORnet and can serve as a model for other DRNs.</p>","PeriodicalId":72880,"journal":{"name":"EGEMS (Washington, DC)","volume":" ","pages":"3"},"PeriodicalIF":0.0,"publicationDate":"2018-04-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5983028/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"36205112","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Systematic Method for Exploring Data Attributes in Preparation for Designing Tailored Infographics of Patient Reported Outcomes.","authors":"Adriana Arcia, Janet Woollen, Suzanne Bakken","doi":"10.5334/egems.190","DOIUrl":"10.5334/egems.190","url":null,"abstract":"<p><strong>Context: </strong>Tailored visualizations of patient reported outcomes (PROs) are valuable health communication tools to support shared decision making, health self-management, and engagement with research participants, such as cohorts in the NIH Precision Medicine Initiative. The automation of visualizations presents some unique design challenges. Efficient design processes depend upon gaining a thorough understanding of the data prior to prototyping.</p><p><strong>Case description: </strong>We present a systematic method for exploring data attributes, with a specific focus on application to self-reported health data. The method entails a) determining the meaning of the variable to be visualized, b) identifying the possible and likely values, and c) understanding how values are interpreted.</p><p><strong>Findings: </strong>We present two case studies to illustrate how this method affected our design decisions, particularly with respect to outlier and non-missing zero values.</p><p><strong>Major themes: </strong>The use of a systematic method made our process of exploring data attributes easily manageable. 
The limitations of the data can narrow design options but can also prompt creative solutions and innovative design opportunities.</p><p><strong>Conclusion: </strong>A systematic method of exploration of data contributes to an efficient design process, uncovers design opportunities, and alerts the designer to design challenges.</p>","PeriodicalId":72880,"journal":{"name":"EGEMS (Washington, DC)","volume":" ","pages":"2"},"PeriodicalIF":0.0,"publicationDate":"2018-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5983055/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"36205111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lucas W Thornblade, David R Flum, Abraham D Flaxman
{"title":"Predicting Future Elective Colon Resection for Diverticulitis Using Patterns of Health Care Utilization.","authors":"Lucas W Thornblade, David R Flum, Abraham D Flaxman","doi":"10.5334/egems.193","DOIUrl":"https://doi.org/10.5334/egems.193","url":null,"abstract":"<p><strong>Background: </strong>Recurrent diverticulitis is the most common reason for elective colon surgery and, although professional societies now recommend against early resection, its use continues to rise. Shared decision making decreases use of low-value surgery but identifying which patients are most likely to elect surgery has proven difficult. We hypothesized that Machine Learning algorithms using health care utilization (HCU) data can predict future clinical events including early resection for diverticulitis.</p><p><strong>Study design: </strong>We developed models for predicting future surgery among patients with new diagnoses of diverticulitis (2009-2012) from the MarketScan® database. Claims data (diagnosis, procedural, and drug codes) were used to train three Machine Learning algorithms to predict surgery occurring between 52 and 104 weeks following diagnosis.</p><p><strong>Results: </strong>Of 82,231 patients with incident diverticulitis (age 51 ± 8 years, 52% female), 1.2% went on to elective colon resection. Using maximal training data (152 consecutive weeks of claims), the Gradient Boosting Machine model predicted elective surgery with an area under the curve (AUC) of 75% (95% uncertainty interval [UI] 71-79%). Models trained on less data resulted in less accurate prediction (AUC: 68% [64-74%] using 128 weeks, 57% [53-63%] using 104 weeks). The majority of resections (85%) were identified as low-value.</p><p><strong>Conclusion: </strong>By applying Machine Learning to HCU data from the time around a diagnosis of diverticulitis, we predicted elective surgery weeks to months in advance, with moderate accuracy. 
Identifying patients who are most likely to elect surgery for diverticulitis provides an opportunity for effective shared decision making initiatives aimed at reducing the use of costly low-value care.</p>","PeriodicalId":72880,"journal":{"name":"EGEMS (Washington, DC)","volume":" ","pages":"1"},"PeriodicalIF":0.0,"publicationDate":"2018-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5983027/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"36205110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Friedrich Maximilian von Recklinghausen, Andreas Taenzer, Chrissie Gorman, Jay Knowlton, Allison Kinslow, Ron Russell
{"title":"The Truth is in the Data - Differences in the Same Measure Based on Different Sources among HVHC Members Using ICU Length of Stay as an Example.","authors":"Friedrich Maximilian von Recklinghausen, Andreas Taenzer, Chrissie Gorman, Jay Knowlton, Allison Kinslow, Ron Russell","doi":"10.5334/egems.194","DOIUrl":"https://doi.org/10.5334/egems.194","url":null,"abstract":"<p><strong>Introduction: </strong>Intensive Care Unit (ICU) length of stay is a strong indicator of severity of illness and cost in the care of sepsis patients. In this case study, we examine the difference between electronic health record (EHR)-based submissions and Centers for Medicare and Medicaid Services (CMS) payment data.</p><p><strong>Methods: </strong>Member-submitted EHR data contained 26,733 unique patient records. The CMS data contained demographics, diagnosis, and revenue codes. After linking EHR data to CMS data, we found a discrepancy in ICU days from CMS claims vs. EHR data. Our hypothesis was that removing intermediate ICU LOS would result in a closer match from CMS claims with EHR data. We suspected the use of Intermediate ICU stays in our CMS ICU definition contaminated our ICU LOS data. This resulted in a review of the sepsis specification, further investigation of the data, and follow-up conversations with the Member organizations.</p><p><strong>Results: </strong>Agreement between EHR and CMS data improved from 73 percent to 86 percent once the Intermediate ICU time had been removed.</p><p><strong>Discussion and conclusions: </strong>The inclusion of Intermediate ICU in the analysis of severely ill sepsis patients from CMS data diluted the importance of using an ICU LOS for estimating the severity of illness and the cost to the healthcare system. We must ensure that clinical definitions are consistent between data sources that were built for different purposes. 
Additionally, we learned that engaging with clinicians, analysts, and clinical coders early in the process is required to fully understand the complexities from different sources.</p>","PeriodicalId":72880,"journal":{"name":"EGEMS (Washington, DC)","volume":"5 3","pages":"3"},"PeriodicalIF":0.0,"publicationDate":"2017-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5982996/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"36204680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Brent C James, David P Edwards, Alan F James, Richard L Bradshaw, Keith S White, Chris Wood, Stan Huff
{"title":"An Efficient, Clinically-Natural Electronic Medical Record System that Produces Computable Data.","authors":"Brent C James, David P Edwards, Alan F James, Richard L Bradshaw, Keith S White, Chris Wood, Stan Huff","doi":"10.5334/egems.202","DOIUrl":"https://doi.org/10.5334/egems.202","url":null,"abstract":"<p><p>Current commercially-available electronic medical record systems produce mainly text-based information focused on financial and regulatory performance. We combined an existing method for organizing complex computer systems-which we label activity-based design-with a proven approach for integrating clinical decision support into front-line care delivery-Care Process Models. The clinical decision support approach increased the structure of textual clinical documentation, to the point where established methods for converting text into computable data (natural language processing) worked efficiently. In a simple trial involving radiology reports for examinations performed to rule out pneumonia, more than 98 percent of all documentation generated was captured as computable data. Use cases across a broad range of other physician, nursing, and physical therapy clinical applications subjectively show similar effects. The resulting system is clinically natural, puts clinicians in direct, rapid control of clinical content without information technology intermediaries, and can generate complete clinical documentation. It supports embedded secondary functions such as the generation of granular activity-based costing data, and embedded generation of clinical coding (e.g., CPT, ICD-10 or SNOMED). 
Most important, widely-available computable data has the potential to greatly improve care delivery management and outcomes.</p>","PeriodicalId":72880,"journal":{"name":"EGEMS (Washington, DC)","volume":"5 3","pages":"8"},"PeriodicalIF":0.0,"publicationDate":"2017-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5982922/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"36204683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Effect of the Hospital Readmission Reduction Program on the Duration of Observation Stays: Using Regression Discontinuity to Estimate Causal Effects.","authors":"Jordan Albritton, Thomas Belnap, Lucy Savitz","doi":"10.5334/egems.197","DOIUrl":"https://doi.org/10.5334/egems.197","url":null,"abstract":"<p><strong>Research objective: </strong>Determine whether hospitals are increasing the duration of observation stays following index admission for heart failure to avoid potential payment penalties from the Hospital Readmission Reduction Program.</p><p><strong>Study design: </strong>The Hospital Readmission Reduction Program applies a 30-day cutoff after which readmissions are no longer penalized. Given this seemingly arbitrary cutoff, we use regression discontinuity design, a quasi-experimental research design that can be used to make causal inferences.</p><p><strong>Population studied: </strong>The High Value Healthcare Collaborative includes member healthcare systems covering 57% of the nation's hospital referral regions. We used Medicare claims data including all patients residing within these regions. The study included patients with index admissions for heart failure from January 1, 2012 to June 30, 2015 and a subsequent observation stay within 60 days. We excluded hospitals with fewer than 25 heart failure readmissions in a year or fewer than 5 observation stays in a year and patients with subsequent observation stays at a different hospital.</p><p><strong>Principal findings: </strong>Overall, there was no discontinuity at the 30-day cutoff in the duration of observation stays, the percent of observation stays over 12 hours, or the percent of observation stays over 24 hours. 
In the sub-analysis, the discontinuity was significant for non-penalized hospitals.</p><p><strong>Conclusion: </strong>The findings reveal evidence that the HRRP has resulted in an increase in the duration of observation stays for some non-penalized hospitals.</p>","PeriodicalId":72880,"journal":{"name":"EGEMS (Washington, DC)","volume":"5 3","pages":"6"},"PeriodicalIF":0.0,"publicationDate":"2017-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5994952/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"36247818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Andrew J Knighton, Kimberly D Brunisholz, Samuel T Savitz
{"title":"Detecting Risk of Low Health Literacy in Disadvantaged Populations Using Area-based Measures.","authors":"Andrew J Knighton, Kimberly D Brunisholz, Samuel T Savitz","doi":"10.5334/egems.191","DOIUrl":"https://doi.org/10.5334/egems.191","url":null,"abstract":"<p><strong>Introduction: </strong>Socio-economic status (SES) and low health literacy (LHL) are closely correlated. Both are directly associated with clinical and behavioral risk factors and healthcare outcomes. Learning healthcare systems are introducing small-area measures to address the challenges associated with maintaining patient-reported measures of SES and LHL. This study's purpose was to measure the association between two available census block measures associated with SES and LHL. Understanding the relationship can guide the identification of a multi-purpose area based measure for delivery system use.</p><p><strong>Methods: </strong>A retrospective observational design was deployed using all US Census block groups in Utah. The principal dependent variable was a nationally-standardized health literacy score (HLS). The primary explanatory variable was a state-standardized area deprivation index (ADI). Statistical methods included linear regression and tests of association. Receiver operating characteristic (ROC) analysis was used to develop LHL criteria using ADI.</p><p><strong>Results: </strong>A significant negative association between the HLS and the ADI score remained after adjusting for area-level risk factors (β: -0.21 (95% CI: -0.22, -0.19) p < .001). Eighteen block groups (<1%) were identified as having LHL using HLS. A combination of three or more ADI components correlated with LHL predicted 78% of HLS LHL block groups and 35 additional block groups not identified using HLS (c-statistic: 0.64; 95% CI: 0.62, 0.66).</p><p><strong>Conclusions: </strong>HLS and ADI use differing measurement criteria but are closely correlated. 
A state-based ADI detected additional neighborhoods with risk of LHL compared to use of a national HLS. An ADI represents a multi-purpose area measure of social determinants useful for learning health systems tailoring care.</p>","PeriodicalId":72880,"journal":{"name":"EGEMS (Washington, DC)","volume":"5 3","pages":"7"},"PeriodicalIF":0.0,"publicationDate":"2017-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.5334/egems.191","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"36247819","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Genna R Cohen, David J Jones, Jessica Heeringa, Kirsten Barrett, Michael F Furukawa, Dan Miller, Anne Mutti, James D Reschovsky, Rachel Machta, Stephen M Shortell, Taressa Fraze, Eugene Rich
{"title":"Leveraging Diverse Data Sources to Identify and Describe U.S. Health Care Delivery Systems.","authors":"Genna R Cohen, David J Jones, Jessica Heeringa, Kirsten Barrett, Michael F Furukawa, Dan Miller, Anne Mutti, James D Reschovsky, Rachel Machta, Stephen M Shortell, Taressa Fraze, Eugene Rich","doi":"10.5334/egems.200","DOIUrl":"10.5334/egems.200","url":null,"abstract":"<p><p>Health care delivery systems are a growing presence in the U.S., yet research is hindered by the lack of universally agreed-upon criteria to denote formal systems. A clearer understanding of how to leverage real-world data sources to empirically identify systems is a necessary first step to such policy-relevant research. We draw from our experience in the Agency for Healthcare Research and Quality's Comparative Health System Performance (CHSP) initiative to assess available data sources to identify and describe systems, including system members (for example, hospitals and physicians) and relationships among the members (for example, hospital ownership of physician groups). We highlight five national data sources that either explicitly track system membership or detail system relationships: (1) American Hospital Association annual survey of hospitals; (2) Healthcare Relational Services Databases; (3) SK&A Healthcare Databases; (4) Provider Enrollment, Chain, and Ownership System; and (5) Internal Revenue Service 990 forms. Each data source has strengths and limitations for identifying and describing systems due to their varied content, linkages across data sources, and data collection methods. In addition, although no single national data source provides a complete picture of U.S. systems and their members, the CHSP initiative will create an early model of how such data can be combined to compensate for their individual limitations. 
Identifying systems in a way that can be repeated over time and linked to a host of other data sources will support analysis of how different types of organizations deliver health care and, ultimately, comparison of their performance.</p>","PeriodicalId":72880,"journal":{"name":"EGEMS (Washington, DC)","volume":"5 3","pages":"9"},"PeriodicalIF":0.0,"publicationDate":"2017-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5983023/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"36205109","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jay Knowlton, Tom Belnap, Bonnie Patelesio, Elisa L Priest, Friedrich von Recklinghausen, Andreas H Taenzer
{"title":"A Framework for Aligning Data from Multiple Institutions to Conduct Meaningful Analytics.","authors":"Jay Knowlton, Tom Belnap, Bonnie Patelesio, Elisa L Priest, Friedrich von Recklinghausen, Andreas H Taenzer","doi":"10.5334/egems.195","DOIUrl":"https://doi.org/10.5334/egems.195","url":null,"abstract":"<p><strong>Introduction: </strong>Health systems can be supported by collaborative networks focused on data sharing and comparative analytics to identify and rapidly disseminate promising care practices. Standardized data collection, quality assessment, and cleansing is a necessary process to facilitate meaningful analytics for operations, quality improvement, and research. We developed a framework for aligning data from health care delivery systems using the High Value Healthcare Collaborative central registry.</p><p><strong>Framework: </strong>The centralized data registry model allows for multiple layers of data quality assessment. Our framework uses an iterative approach, starting with clear specifications, maintaining ongoing dialogue with diverse stakeholders, and regular checkpoints to assess data conformance, completeness, and plausibility.</p><p><strong>Lessons learned: </strong>We found that an iterative communication process is critical for a central registry to ensure: 1) clarity of data specifications, 2) appropriate data quality, and 3) thorough understanding of data source, purpose, and context. Engaging teams from all participating institutions and incorporating diverse stakeholders of clinicians, information technologists, data analysts, operations managers, and health services researchers in all decision making processes supports development of high quality datasets for comparative analytics across multiple institutions.</p><p><strong>Conclusion: </strong>A standard data specification and submission process alone does not guarantee aligned data for a collaborative registry. 
Implementing an iterative data quality improvement framework with extensive communication proved to be effective for aligning data from multiple institutions to support meaningful analytics.</p>","PeriodicalId":72880,"journal":{"name":"EGEMS (Washington, DC)","volume":"5 3","pages":"2"},"PeriodicalIF":0.0,"publicationDate":"2017-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5982973/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"36204679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}