Emerging algorithmic bias: fairness drift as the next dimension of model maintenance and sustainability.

IF 4.6; Zone 2, Medicine; Q1, Computer Science, Information Systems

Journal of the American Medical Informatics Association Pub Date : 2025-05-01 DOI:10.1093/jamia/ocaf039

Sharon E Davis, Chad Dorn, Daniel J Park, Michael E Matheny

Pages : 845-854 Publication type : Journal Article PDF : https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12012346/pdf/

Citation count: 0

Abstract

Objectives: While performance drift of clinical prediction models is well-documented, the potential for algorithmic biases to emerge post-deployment has had limited characterization. A better understanding of how temporal model performance may shift across subpopulations is required to incorporate fairness drift into model maintenance strategies.

Materials and methods: We explore fairness drift in a national population over 11 years, with and without model maintenance aimed at sustaining population-level performance. We trained random forest models predicting 30-day post-surgical readmission, mortality, and pneumonia using 2013 data from US Department of Veterans Affairs facilities. We evaluated performance quarterly from 2014 to 2023 by self-reported race and sex. We estimated discrimination, calibration, and accuracy, and operationalized fairness using metric parity measured as the gap between disadvantaged and advantaged groups.
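The metric-parity idea in the methods can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the record layout, the group labels "A" and "B", the toy scores, the function names, and the sign convention (disadvantaged minus advantaged) are all assumptions made for the example. It computes discrimination (AUROC) per group per quarter and reports the gap between groups.

```python
from collections import defaultdict


def auroc(labels, scores):
    """AUROC by pairwise comparison (Mann-Whitney U); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None  # undefined unless both outcome classes are present
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def parity_gaps(records, disadvantaged, advantaged):
    """Map each quarter to AUROC(disadvantaged) - AUROC(advantaged).

    `records` is an iterable of (quarter, group, label, score) tuples,
    a hypothetical layout chosen for this sketch.
    """
    buckets = defaultdict(lambda: ([], []))
    for quarter, group, label, score in records:
        labels, scores = buckets[(quarter, group)]
        labels.append(label)
        scores.append(score)

    gaps = {}
    for quarter in sorted({q for q, _ in buckets}):
        values = []
        for group in (disadvantaged, advantaged):
            if (quarter, group) not in buckets:
                values.append(None)
            else:
                values.append(auroc(*buckets[(quarter, group)]))
        if None not in values:  # skip quarters where a group's AUROC is undefined
            gaps[quarter] = values[0] - values[1]
    return gaps


# Toy data (invented): group "B" is treated as the disadvantaged group.
records = [
    ("2014Q1", "A", 1, 0.9), ("2014Q1", "A", 0, 0.2),
    ("2014Q1", "A", 1, 0.8), ("2014Q1", "A", 0, 0.3),
    ("2014Q1", "B", 1, 0.9), ("2014Q1", "B", 0, 0.2),
    ("2014Q1", "B", 1, 0.4), ("2014Q1", "B", 0, 0.5),
    ("2014Q2", "A", 1, 0.9), ("2014Q2", "A", 0, 0.2),
    ("2014Q2", "A", 1, 0.8), ("2014Q2", "A", 0, 0.3),
    ("2014Q2", "B", 1, 0.9), ("2014Q2", "B", 0, 0.2),
    ("2014Q2", "B", 1, 0.8), ("2014Q2", "B", 0, 0.3),
]

gaps = parity_gaps(records, disadvantaged="B", advantaged="A")
print(gaps)
```

A gap near zero indicates parity on the chosen metric; a gap whose size changes from quarter to quarter is what the abstract calls fairness drift. The same pattern would apply to calibration or accuracy by swapping the metric function.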

Results: Our cohort included 1 739 666 surgical cases. We observed fairness drift in both the original and temporally updated models. Model updating had a larger impact on overall performance than fairness gaps. During periods of stable fairness, updating models at the population level increased, decreased, or did not impact fairness gaps. During periods of fairness drift, updating models restored fairness in some cases and exacerbated fairness gaps in others.

Discussion: This exploratory study highlights that algorithmic fairness cannot be assured through one-time assessments during model development. Temporal changes in fairness may take multiple forms and interact with model updating strategies in unanticipated ways.

Conclusion: Equitable and sustainable clinical artificial intelligence deployments will require novel methods to monitor algorithmic fairness, detect emerging bias, and adopt model updates that promote fairness.


Source journal

Journal of the American Medical Informatics Association (Medicine; Computer Science, Interdisciplinary Applications)

CiteScore: 14.50

Self-citation rate: 7.80%

Articles published: 230

Review time: 3-8 weeks

Journal description: JAMIA is AMIA's premier peer-reviewed journal for biomedical and health informatics. Covering the full spectrum of activities in the field, JAMIA includes informatics articles in the areas of clinical care, clinical research, translational science, implementation science, imaging, education, consumer health, public health, and policy. JAMIA's articles describe innovative informatics research and systems that help to advance biomedical science and to promote health. Case reports, perspectives, and reviews also help readers stay connected with the most important informatics developments in implementation, policy, and education.