Performance Drift in a Nationally Deployed Population Health Risk Algorithm in the US Veterans Health Administration.

Impact factor: 11.3; JCR quartile: Q1 (Health Care Sciences & Services)

JAMA Health Forum. Pub Date: 2025-08-01. DOI: 10.1001/jamahealthforum.2025.2717

Likhitha Kolla, Kristin Linn, Amol S Navathe, Craig Kreisler, Christopher B Roberts, Sae-Hwan Park, Harvineet Singh, Jean Feng, Jinbo Chen, Ravi B Parikh

Volume 6, issue 8, article e252717. Full-text PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12357188/pdf/

Citations: 0

Abstract


Performance Drift in a Nationally Deployed Population Health Risk Algorithm in the US Veterans Health Administration.

Importance: Clinical risk algorithms inform clinical decision support and system-level quality metrics. However, algorithm performance can drift over time and possibly promote misinformed decision-making and resource allocation. The Veterans Health Administration (VA) Care Assessment Needs (CAN) algorithm is a nationally deployed population risk algorithm used to predict risk of 90-day hospitalization and/or mortality and to allocate resources for more than 5 million veterans annually. However, drift affecting the VA CAN has not been assessed.

Objective: To evaluate the impact of drift in the VA CAN algorithm and the extent, mechanisms, and clinical consequences of performance changes.

Design, setting, and participants: This was a retrospective cohort study using electronic health records (EHRs) and administrative data from the VA Corporate Data Warehouse, which contains observations from more than 5 million veterans per study year. The data comprised 27 787 152 observations among 7 215 711 unique veterans receiving VA primary care from 2016 to 2021. Data analysis was performed from January 2023 to December 2024.

Main outcomes and measures: The 2 primary outcomes were change in model performance (true positive rate [TPR], false positive rate [FPR], positive predictive value [PPV], negative predictive value [NPV], F1 score, and accuracy) and a national quality metric (the percentage of veterans with a CAN score at or above the 90th percentile who had a palliative care visit).
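The six performance measures listed above all derive from the four cells of a confusion matrix. The following is a minimal sketch of those standard definitions, not the study's analysis code; the `Confusion` class, the `performance` function, and the example counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Counts for a binary classifier at one fixed risk threshold."""
    tp: int  # predicted high risk, outcome occurred
    fp: int  # predicted high risk, outcome did not occur
    fn: int  # predicted low risk, outcome occurred
    tn: int  # predicted low risk, outcome did not occur


def performance(c: Confusion) -> dict[str, float]:
    """Return the six measures named in the abstract (standard definitions)."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "TPR": c.tp / (c.tp + c.fn),                 # sensitivity
        "FPR": c.fp / (c.fp + c.tn),                 # 1 - specificity
        "PPV": c.tp / (c.tp + c.fp),                 # precision
        "NPV": c.tn / (c.tn + c.fn),
        "F1": 2 * c.tp / (2 * c.tp + c.fp + c.fn),   # harmonic mean of PPV and TPR
        "accuracy": (c.tp + c.tn) / total,
    }


# Hypothetical counts for illustration only; these are not study data.
example = Confusion(tp=30, fp=70, fn=10, tn=890)
metrics = performance(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

TPR and FPR are conditioned on the true outcome, whereas PPV and NPV are conditioned on the prediction, so the latter pair also depends on how common the outcome is in the population being scored.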

Results: The study population included 7 215 711 eligible veterans, with a mean (SD) age of 62.1 (16.5) years; 91.2% were male, and 18.2% were Black, 6.6% Hispanic, and 76.2% White individuals. From 2016 to 2021, PPV decreased by 4.0% (95% CI, -5.1% to -2.8%); F1 score decreased by 4.6% (95% CI, -6.1% to -3.0%); NPV increased by 0.43% (95% CI, 0.30% to 0.57%); and FPR increased by 0.34% (95% CI, 0.10% to 0.58%), corresponding to 18 288 additional false-positive results. TPR and accuracy remained stable. The rate of 90-day hospitalization and/or death decreased from 3.8% in 2017 to 3.0% in 2021. Covariate shifts were observed in 19 covariates, with demographic, health care utilization, and laboratory covariates exhibiting the largest shifts. The palliative care quality metric was 2.9% (95% CI, 2.8% to 2.9%) in 2018, 2.6% (95% CI, 2.6% to 2.7%) in 2019, and 2.8% (95% CI, 2.7% to 2.8%) in 2020, with FPRs among metric-eligible veterans increasing from 81.6% (95% CI, 81.5% to 81.7%) in 2018 to 85.7% (95% CI, 85.6% to 85.8%) in 2020.
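A falling outcome rate can by itself lower PPV even when TPR and FPR do not move, which follows from Bayes' rule. The sketch below illustrates that relationship using the two outcome rates reported above (3.8% and 3.0%); the TPR and FPR values are assumptions picked for illustration and are not reported in the abstract, so the resulting PPVs are not estimates of the study's results.

```python
def ppv_from_rates(tpr: float, fpr: float, prevalence: float) -> float:
    """PPV via Bayes' rule: P(outcome | flagged) for given TPR, FPR, prevalence."""
    true_pos = tpr * prevalence            # share of population flagged with the outcome
    false_pos = fpr * (1.0 - prevalence)   # share of population flagged without the outcome
    return true_pos / (true_pos + false_pos)


# Assumed operating point (hypothetical, not taken from the study).
ASSUMED_TPR = 0.40
ASSUMED_FPR = 0.09

# Outcome rates reported in the abstract for 2017 and 2021.
ppv_2017 = ppv_from_rates(ASSUMED_TPR, ASSUMED_FPR, prevalence=0.038)
ppv_2021 = ppv_from_rates(ASSUMED_TPR, ASSUMED_FPR, prevalence=0.030)

print(f"PPV at 3.8% prevalence: {ppv_2017:.4f}")
print(f"PPV at 3.0% prevalence: {ppv_2021:.4f}")
print(f"Absolute change in PPV: {ppv_2021 - ppv_2017:+.4f}")
```

Under these assumed rates, PPV is lower at the lower prevalence, which is consistent in direction with the reported pattern of declining PPV alongside stable TPR. The abstract also reports a small FPR increase and covariate shifts, which this sketch deliberately holds fixed.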

Conclusions and relevance: This cohort study found that CAN algorithm performance declined from 2016 to 2021 due to shifts in outcome prevalence and distributions of health care utilization and demographic covariates. Close surveillance of clinical risk algorithms and quality metrics derived from algorithm-generated risk scores could mitigate suboptimal resource allocation or decision-making.


Source journal

JAMA Health Forum

CiteScore

4.00

Self-citation rate

7.80%

Publication volume

Journal description: JAMA Health Forum is an international, peer-reviewed, online, open access journal that addresses health policy and strategies affecting medicine, health, and health care. The journal publishes original research, evidence-based reports, and opinion about national and global health policy. It covers innovative approaches to health care delivery and health care economics, access, quality, safety, equity, and reform. In addition to publishing articles, JAMA Health Forum also features commentary from health policy leaders on the JAMA Forum. It covers news briefs on major reports released by government agencies, foundations, health policy think tanks, and other policy-focused organizations. JAMA Health Forum is a member of the JAMA Network, which is a consortium of peer-reviewed, general medical and specialty publications. The journal presents curated health policy content from across the JAMA Network, including journals such as JAMA and JAMA Internal Medicine.