Evaluating machine learning model bias and racial disparities in non-small cell lung cancer using SEER registry data.

Impact factor: 2.0; Zone 3 (Medicine); JCR Q2 (Health Policy & Services)

Health Care Management Science Pub Date : 2024-12-01 Epub Date: 2024-11-04 DOI:10.1007/s10729-024-09691-6

Cameron Trentz, Jacklyn Engelbart, Jason Semprini, Amanda Kahl, Eric Anyimadu, John Buatti, Thomas Casavant, Mary Charlton, Guadalupe Canahuate

Pages: 631-649. Publication type: Journal Article.


Abstract

Background: Despite decades of pursuing health equity, racial and ethnic disparities persist in healthcare in America. For cancer specifically, one of the leading observed disparities is worse mortality among non-Hispanic Black patients compared to non-Hispanic White patients across the cancer care continuum. These real-world disparities are reflected in the data used to inform the decisions made to alleviate such inequities. Failing to account for inherently biased data underlying these observations could intensify racial cancer disparities and lead to misguided efforts that fail to appropriately address the real causes of health inequity.

Objective: Estimate the racial/ethnic bias of machine learning models in predicting two-year survival and surgery treatment recommendation for non-small cell lung cancer (NSCLC) patients.

Methods: A Cox survival model, a logit model, and three other machine learning models for predicting surgery recommendation were trained using SEER data from NSCLC patients diagnosed between 2000 and 2018. Models were trained with a 70/30 train/test split (both including and excluding race/ethnicity) and evaluated using performance and fairness metrics. The effects of oversampling the training data were also evaluated.

Results: The survival models show disparate impact towards non-Hispanic Black patients regardless of whether race/ethnicity is used as a predictor. The models including race/ethnicity amplified the disparities observed in the data. The exclusion of race/ethnicity as a predictor in the survival and surgery recommendation models improved fairness metrics without degrading model performance. Stratified oversampling strategies reduced disparate impact while reducing the accuracy of the model.

Conclusion: NSCLC disparities are complex and multifaceted. Yet, even when accounting for age and stage at diagnosis, non-Hispanic Black patients with NSCLC are less often recommended to have surgery than non-Hispanic White patients. Machine learning models amplified the racial/ethnic disparities across the cancer care continuum (which are reflected in the data used to make model decisions). Excluding race/ethnicity lowered the bias of the models but did not affect disparate impact. Developing analytical strategies to improve fairness would in turn improve the utility of machine learning approaches analyzing population-based cancer data.
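The abstract reports "disparate impact" without defining it. As a rough illustration only, the sketch below assumes the common ratio definition: the rate of favorable predictions in the unprivileged group divided by the rate in the privileged group, where values below about 0.8 are often read as a warning sign (the "four-fifths" convention). The function names, the group labels `NHB` and `NHW`, and the toy predictions are all hypothetical; none of this is taken from the paper, and the authors' metric may be defined differently.

```python
from collections import defaultdict


def favorable_rates(groups, predictions):
    """Return the share of favorable (1) predictions within each group."""
    totals = defaultdict(int)
    favorable = defaultdict(int)
    for group, pred in zip(groups, predictions):
        totals[group] += 1
        favorable[group] += int(pred == 1)
    return {group: favorable[group] / totals[group] for group in totals}


def disparate_impact(groups, predictions, unprivileged, privileged):
    """Ratio of favorable-prediction rates: unprivileged / privileged.

    Assumed definition for illustration; 1.0 means equal rates.
    """
    rates = favorable_rates(groups, predictions)
    if rates[privileged] == 0:
        raise ValueError("privileged group has no favorable predictions")
    return rates[unprivileged] / rates[privileged]


# Toy data (made up): 1 = model predicts "surgery recommended".
groups = ["NHW"] * 5 + ["NHB"] * 5
predictions = [1, 1, 1, 1, 0, 1, 1, 0, 0, 0]

rates = favorable_rates(groups, predictions)
di = disparate_impact(groups, predictions, unprivileged="NHB", privileged="NHW")
print(rates, di)
```

In this toy example the favorable rate is 4 of 5 for `NHW` and 2 of 5 for `NHB`, so the ratio is one half. The same ratio can be computed on the observed labels instead of the model predictions, which gives one simple way to check whether a model amplifies a disparity already present in the data.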




Source journal

Health Care Management Science (Health Policy & Services)

CiteScore: 7.20

Self-citation rate: 5.60%

About the journal

Health Care Management Science publishes papers dealing with health care delivery, health care management, and health care policy. Papers should have a decision focus and make use of quantitative methods including management science, operations research, analytics, machine learning, and other emerging areas. Articles must clearly articulate the relevance and the realized or potential impact of the work. Applied research will be considered and is of particular interest if there is evidence that it was implemented or informed a decision-making process. Papers describing routine applications of known methods are discouraged. Authors are encouraged to disclose all data and analyses thereof, and to provide computational code when appropriate. Editorial statements for the individual departments are provided below.

Health Care Analytics

Departmental Editors: Margrét Bjarnadóttir, University of Maryland; Nan Kong, Purdue University

With the explosion in computing power and available data, we have seen fast changes in the analytics applied in the healthcare space. The Health Care Analytics department welcomes papers applying a broad range of analytical approaches, including those rooted in machine learning, survival analysis, and complex event analysis, that allow healthcare professionals to find opportunities for improvement in health system management, patient engagement, spending, and diagnosis. We especially encourage papers that combine predictive and prescriptive analytics to improve decision making and health care outcomes. The contribution of papers can be across multiple dimensions including new methodology, novel modeling techniques and health care through real-world cohort studies. Papers that are methodologically focused must also show practical relevance. Similarly, papers that are application focused should clearly demonstrate improvements over the status quo and available approaches by applying rigorous analytics.

Health Care Operations Management

Departmental Editors: Nilay Tanik Argon, University of North Carolina at Chapel Hill; Bob Batt, University of Wisconsin

The department invites high-quality papers on the design, control, and analysis of operations at healthcare systems. We seek papers on classical operations management issues (such as scheduling, routing, queuing, transportation, patient flow, and quality) as well as non-traditional problems driven by ever-changing healthcare practice. Empirical, experimental, and analytical (model based) methodologies are all welcome. Papers may draw theory from across disciplines, and should provide insight into improving operations from the perspective of patients, service providers, organizations (municipal/government/industry), and/or society.

Health Care Management Science Practice

Departmental Editor: Vikram Tiwari, Vanderbilt University Medical Center

The department seeks research from academicians and practitioners that highlights Management Science based solutions directly relevant to the practice of healthcare. Relevance is judged by the impact on practice, as well as the degree to which researchers engaged with practitioners in understanding the problem context and in developing the solution. Validity, that is, the extent to which the results presented do or would apply in practice, is a key evaluation criterion. In addition to meeting the journal’s standards of originality and substantial contribution to knowledge creation, research that can be replicated in other organizations is encouraged. Papers describing unsuccessful applied research projects may be considered if there are generalizable learning points addressing why the project was unsuccessful.

Health Care Productivity Analysis

Departmental Editor: Jonas Schreyögg, University of Hamburg

The department invites papers with rigorous methods and significant impact for policy and practice. Papers typically apply theory and techniques to measuring productivity in health care organizations and systems. The journal welcomes state-of-the-art parametric as well as non-parametric techniques such as data envelopment analysis, stochastic frontier analysis or partial frontier analysis. The contribution of papers can be manifold including new methodology, novel combination of existing methods or application of existing methods to new contexts. Empirical papers should produce results generalizable beyond a selected set of health care organizations. All papers should include a section on implications for management or policy to enhance productivity.

Public Health Policy and Medical Decision Making

Departmental Editors: Ebru Bish, University of Alabama; Julie L. Higle, University of Southern California

The department invites high quality papers that use data-driven methods to address important problems that arise in public health policy and medical decision-making domains. We welcome submissions that develop and apply mathematical and computational models in support of data-driven and model-based analyses for these problems. The Public Health Policy and Medical Decision-Making Department is particularly interested in papers that: study high-impact problems involving health policy, treatment planning and design, and clinical applications; develop original data-driven models, including those that integrate disease modeling with screening and/or treatment guidelines; use model-based analyses as decision-making tools to identify optimal solutions, insights, recommendations. Articles must clearly articulate the relevance of the work to decision and/or policy makers and the potential impact on patients and/or society. Papers will include articulated contributions within the methodological domain, which may include modeling, analytical, or computational methodologies.

Emerging Topics

Departmental Editor: Alec Morton, University of Strathclyde

Emerging Topics will handle papers which use innovative quantitative methods to shed light on frontier issues in healthcare management and policy. Such papers may deal with analytic challenges arising from novel health technologies or new organizational forms. Papers falling under this department may also deal with the analysis of new forms of data which are increasingly captured as health systems become more and more digitized.