Mary M Lucas, Mario Schootman, Jonathan A Laryea, Sonia T Orcutt, Chenghui Li, Jun Ying, Jennifer A Rumpel, Christopher C Yang
JCO Clinical Cancer Informatics, volume 8. Published 2024-11-01 (Epub 2024-10-09). DOI: 10.1200/CCI.23.00194 (https://doi.org/10.1200/CCI.23.00194). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11741203/pdf/
Bias in Prediction Models to Identify Patients With Colorectal Cancer at High Risk for Readmission After Resection.
Purpose: Machine learning algorithms are used for predictive modeling in medicine, but studies often do not evaluate or report on the potential biases of the models. Our purpose was to develop clinical prediction models for readmission after surgery in colorectal cancer (CRC) patients and to examine their potential for racial bias.
Methods: We used the 2012-2020 American College of Surgeons' National Surgical Quality Improvement Program (ACS-NSQIP) Participant Use File and Targeted Colectomy File. Patients were categorized into four race groups: White, Black or African American, Other, and Unknown/Not Reported. Potential predictive features were identified from studies of risk factors for 30-day readmission in CRC patients. We compared four machine learning-based methods: logistic regression (LR), multilayer perceptron (MLP), random forest (RF), and XGBoost (XGB). Model bias was assessed using false negative rate (FNR) difference, false positive rate (FPR) difference, and disparate impact.
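The FNR and FPR differences named in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy labels, and the choice of White as the reference group are assumptions made only for illustration, and the abstract does not state which reference group or classification threshold was used.

```python
from collections import defaultdict


def group_error_rates(y_true, y_pred, groups):
    """Return {group: {"fnr": ..., "fpr": ...}} for binary labels and predictions."""
    counts = defaultdict(lambda: {"tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == 1:
            key = "tp" if pred == 1 else "fn"
        else:
            key = "fp" if pred == 1 else "tn"
        counts[group][key] += 1

    rates = {}
    for group, c in counts.items():
        positives = c["tp"] + c["fn"]  # patients who were actually readmitted
        negatives = c["fp"] + c["tn"]  # patients who were not readmitted
        rates[group] = {
            "fnr": c["fn"] / positives if positives else float("nan"),
            "fpr": c["fp"] / negatives if negatives else float("nan"),
        }
    return rates


def rate_differences(rates, reference):
    """Subtract the reference group's FNR and FPR from every other group's."""
    ref = rates[reference]
    return {
        group: {
            "fnr_diff": r["fnr"] - ref["fnr"],
            "fpr_diff": r["fpr"] - ref["fpr"],
        }
        for group, r in rates.items()
        if group != reference
    }


# Toy data (hypothetical, not from the study): 1 = readmitted / flagged high risk.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 1, 0]
groups = ["White"] * 4 + ["Other"] * 4

rates = group_error_rates(y_true, y_pred, groups)
diffs = rate_differences(rates, reference="White")  # reference group is an assumption
print(rates)
print(diffs)
```

A positive `fnr_diff` means the model misses more true readmissions in that group than in the reference group; a positive `fpr_diff` means it flags more patients who are not readmitted.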
Results: In all, 112,077 patients were included, 67.2% of whom were White, 9.2% Black, 5.6% Other race, and 18% with race not recorded. There were significant differences in the area under the receiver operating characteristic curve (AUROC), FPR, and FNR between race groups across all models. Notably, patients in the 'Other' race category had a higher FNR than Black patients in all but the XGB model, while Black patients had a higher FPR than White patients in some models. Patients in the 'Other' category consistently had the lowest FPR. Applying the 80% rule for disparate impact, the models consistently met the threshold for unfairness for the 'Other' race category.
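The 80% rule for disparate impact can be sketched as follows. This assumes the common definition, the ratio of a group's rate of positive predictions to the reference group's rate, with a ratio below 0.8 flagged; the abstract does not give the authors' exact formula or reference group, so the function names, the toy predictions, and the White reference are illustrative assumptions.

```python
def selection_rate(predictions):
    """Fraction of patients predicted positive (flagged as high risk)."""
    return sum(predictions) / len(predictions)


def disparate_impact(y_pred, groups, reference):
    """Ratio of each group's selection rate to the reference group's rate."""
    by_group = {}
    for pred, group in zip(y_pred, groups):
        by_group.setdefault(group, []).append(pred)

    ref_rate = selection_rate(by_group[reference])
    if ref_rate == 0:
        raise ValueError("reference group has no positive predictions")
    return {
        group: selection_rate(preds) / ref_rate
        for group, preds in by_group.items()
        if group != reference
    }


def violates_80_percent_rule(ratio, threshold=0.8):
    """Flag a ratio that falls below the four-fifths threshold."""
    return ratio < threshold


# Toy predictions (hypothetical, not from the study): 1 = flagged high risk.
y_pred = [1, 1, 0, 0, 1] + [1, 0, 1, 0, 1] + [1, 0, 0, 0, 0]
groups = ["White"] * 5 + ["Black"] * 5 + ["Other"] * 5

ratios = disparate_impact(y_pred, groups, reference="White")
flags = {group: violates_80_percent_rule(r) for group, r in ratios.items()}
print(ratios)
print(flags)
```

In this toy data the 'Other' group is flagged at one third of the reference rate, which falls below the 0.8 threshold, while the 'Black' group matches the reference rate and is not flagged.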
Conclusion: Predictive models for 30-day readmission after colorectal surgery may perform unequally for different race groups, potentially propagating inequalities in the delivery of care and in patient outcomes if predictions from these models are used to direct care.