Kenneth J Nieser, Daniel J Tancredi, Alex H S Harris
Health Services Research, e70050. Published 2025-10-01. DOI: 10.1111/1475-6773.70050 (https://doi.org/10.1111/1475-6773.70050)
The Unreliability of Two Publicly Reported Outcome Quality Measures for Characterizing Health Care Quality Within the Veterans Health Administration.
Objective: To estimate the reliability of two outcome quality measures in Veterans Health Administration (VHA) data using three different methods.
Study setting and design: We created two cohorts of VHA patients meeting criteria for two measures: (1) risk-standardized complication rates following elective primary total hip arthroplasty and/or total knee arthroplasty (THA/TKA), and (2) risk-standardized mortality rates following acute myocardial infarction hospitalization (AMI). We fit hierarchical logistic regression models and calculated facility-level risk-standardized rates. We estimated entity-level reliability using three commonly applied methods: (1) delta method approximation; (2) latent scale model; (3) split-sample method.
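To make the third method concrete, the following is a minimal sketch of a split-sample reliability estimate on simulated data. It is not the authors' implementation: it uses raw facility rates rather than risk-standardized rates from a hierarchical model, and the Spearman-Brown step-up, the helper names (`pearson`, `spearman_brown`, `split_sample_reliability`), and all simulation parameters are assumptions chosen for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_brown(r):
    """Step a half-sample correlation up to full-sample reliability."""
    return 2 * r / (1 + r)


def split_sample_reliability(outcomes_by_facility, rng):
    """Randomly halve each facility's patients, compute the event rate in
    each half, correlate the two sets of rates across facilities, and
    apply the Spearman-Brown correction (an assumed, commonly used variant)."""
    half_a, half_b = [], []
    for outcomes in outcomes_by_facility:
        shuffled = outcomes[:]
        rng.shuffle(shuffled)
        mid = len(shuffled) // 2
        half_a.append(sum(shuffled[:mid]) / mid)
        half_b.append(sum(shuffled[mid:]) / (len(shuffled) - mid))
    return spearman_brown(pearson(half_a, half_b))


# Hypothetical simulation: 100 facilities, 200 patients each, with
# facility random intercepts on the logit scale (all values are assumptions).
rng = random.Random(0)
facilities = []
for _ in range(100):
    logit = -2.0 + rng.gauss(0, 0.5)
    p = 1 / (1 + math.exp(-logit))
    facilities.append([1 if rng.random() < p else 0 for _ in range(200)])

estimate = split_sample_reliability(facilities, rng)
print(f"split-sample reliability estimate: {estimate:.2f}")
```

Because the split is random, the estimate varies from one split to the next; in practice one might average over many random splits to stabilize it.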
Data sources and analytic sample: For each measure, we extracted risk adjustment and outcome data from the VHA Corporate Data Warehouse for patients meeting eligibility criteria in fiscal years 2021 and 2022.
Principal findings: Most facilities had complication rates following total hip and/or knee arthroplasty and mortality rates following hospitalization for acute myocardial infarction that, statistically, were no different from the national average. Reliability estimates based on delta method approximation (0.14 for THA/TKA; 0.12 for AMI) and the split-sample method (0.12 for THA/TKA; 0.19 for AMI) were very low for both measures. As we varied the sample sizes, we found that much larger sample sizes would be needed to reliably differentiate quality of care across facilities. On the other hand, reliability estimates based on the latent scale model were substantially higher than those from the other two methods (0.64 for THA/TKA; 0.41 for AMI), suggesting that there is substantially more between-facility variation in latent quality than manifests in observed outcomes.
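The gap between the latent scale and delta method estimates can be illustrated with textbook-style formulas for a random-intercept logistic model. The sketch below is an assumption about how such estimates are commonly formed, not the paper's exact computation: the latent scale version takes the within-facility variance to be that of a standard logistic distribution (π²/3), while the delta method version maps the between-facility variance to the probability scale. The function names and the inputs `sigma2`, `n`, and `p` are hypothetical.

```python
import math


def latent_scale_reliability(sigma2: float, n: int) -> float:
    """Reliability on the latent (logit) scale for a facility with n patients.

    sigma2 is the between-facility variance of the random intercept.
    """
    within = (math.pi ** 2) / 3  # variance of the standard logistic distribution
    return sigma2 / (sigma2 + within / n)


def delta_method_reliability(sigma2: float, n: int, p: float) -> float:
    """Reliability on the observed (probability) scale.

    p is the average event probability across facilities.
    """
    between = sigma2 * (p * (1 - p)) ** 2  # delta-method approximation
    within = p * (1 - p) / n               # binomial sampling variance of a rate
    return between / (between + within)


# Hypothetical inputs, not values taken from the study.
sigma2, n, p = 0.1, 100, 0.05
latent = latent_scale_reliability(sigma2, n)
observed = delta_method_reliability(sigma2, n, p)
print(f"latent scale: {latent:.2f}, delta method: {observed:.2f}")
```

Under these two formulas the latent scale value always exceeds the delta method value for the same `sigma2` and `n`, because p(1 - p) never exceeds 0.25 while 3/π² is about 0.30, and the difference widens as the outcome becomes rarer.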
Conclusions: Reliability estimates based on the latent scale approach are not numerically or conceptually interchangeable with estimates based on the other two approaches. Given that health outcomes are generally reported using observed outcomes, reliability estimation based on the latent scale approach should not be used without a strong rationale.
About the journal:
Health Services Research (HSR) is a peer-reviewed scholarly journal that provides researchers and public and private policymakers with the latest research findings, methods, and concepts related to the financing, organization, delivery, evaluation, and outcomes of health services. Rated as one of the top journals in the fields of health policy and services and health care administration, HSR publishes outstanding articles reporting the findings of original investigations that expand knowledge and understanding of the wide-ranging field of health care and that will help to improve the health of individuals and communities.