Stephanie Teeple, Aria G. Smith, Matthew F. Toerper, Scott Levin, Scott Halpern, Oluwakemi Badaki‐Makun, J. Hinson
{"title":"探索遗漏对急诊科分诊机器学习模型预测性能种族差异的影响","authors":"Stephanie Teeple, Aria G. Smith, Matthew F. Toerper, Scott Levin, Scott Halpern, Oluwakemi Badaki‐Makun, J. Hinson","doi":"10.1093/jamiaopen/ooad107","DOIUrl":null,"url":null,"abstract":"To investigate how missing data in the patient problem list may impact racial disparities in the predictive performance of a machine learning (ML) model for emergency department (ED) triage. Racial disparities may exist in the missingness of EHR data (eg, systematic differences in access, testing, and/or treatment) that can impact model predictions across racialized patient groups. We use an ML model that predicts patients’ risk for adverse events to produce triage-level recommendations, patterned after a clinical decision support tool deployed at multiple EDs. We compared the model’s predictive performance on sets of observed (problem list data at the point of triage) versus manipulated (updated to the more complete problem list at the end of the encounter) test data. These differences were compared between Black and non-Hispanic White patient groups using multiple performance measures relevant to health equity. There were modest, but significant, changes in predictive performance comparing the observed to manipulated models across both Black and non-Hispanic White patient groups; c-statistic improvement ranged between 0.027 and 0.058. The manipulation produced no between-group differences in c-statistic by race. However, there were small between-group differences in other performance measures, with greater change for non-Hispanic White patients. Problem list missingness impacted model performance for both patient groups, with marginal differences detected by race. Further exploration is needed to examine how missingness may contribute to racial disparities in clinical model predictions across settings. The novel manipulation method demonstrated may aid future research.","PeriodicalId":36278,"journal":{"name":"JAMIA Open","volume":"13 1","pages":""},"PeriodicalIF":2.5000,"publicationDate":"2023-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Exploring the impact of missingness on racial disparities in predictive performance of a machine learning model for emergency department triage\",\"authors\":\"Stephanie Teeple, Aria G. Smith, Matthew F. Toerper, Scott Levin, Scott Halpern, Oluwakemi Badaki‐Makun, J. Hinson\",\"doi\":\"10.1093/jamiaopen/ooad107\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"To investigate how missing data in the patient problem list may impact racial disparities in the predictive performance of a machine learning (ML) model for emergency department (ED) triage. Racial disparities may exist in the missingness of EHR data (eg, systematic differences in access, testing, and/or treatment) that can impact model predictions across racialized patient groups. We use an ML model that predicts patients’ risk for adverse events to produce triage-level recommendations, patterned after a clinical decision support tool deployed at multiple EDs. We compared the model’s predictive performance on sets of observed (problem list data at the point of triage) versus manipulated (updated to the more complete problem list at the end of the encounter) test data. These differences were compared between Black and non-Hispanic White patient groups using multiple performance measures relevant to health equity. 
There were modest, but significant, changes in predictive performance comparing the observed to manipulated models across both Black and non-Hispanic White patient groups; c-statistic improvement ranged between 0.027 and 0.058. The manipulation produced no between-group differences in c-statistic by race. However, there were small between-group differences in other performance measures, with greater change for non-Hispanic White patients. Problem list missingness impacted model performance for both patient groups, with marginal differences detected by race. Further exploration is needed to examine how missingness may contribute to racial disparities in clinical model predictions across settings. The novel manipulation method demonstrated may aid future research.\",\"PeriodicalId\":36278,\"journal\":{\"name\":\"JAMIA Open\",\"volume\":\"13 1\",\"pages\":\"\"},\"PeriodicalIF\":2.5000,\"publicationDate\":\"2023-10-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"JAMIA Open\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1093/jamiaopen/ooad107\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"HEALTH CARE SCIENCES & SERVICES\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"JAMIA Open","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1093/jamiaopen/ooad107","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"HEALTH CARE SCIENCES & SERVICES","Score":null,"Total":0}
Exploring the impact of missingness on racial disparities in predictive performance of a machine learning model for emergency department triage
Abstract

This study investigated how missing data in the patient problem list may impact racial disparities in the predictive performance of a machine learning (ML) model for emergency department (ED) triage. Racial disparities may exist in the missingness of EHR data (e.g., systematic differences in access, testing, and/or treatment) that can impact model predictions across racialized patient groups. We used an ML model that predicts patients' risk of adverse events to produce triage-level recommendations, patterned after a clinical decision support tool deployed at multiple EDs. We compared the model's predictive performance on observed test data (problem list data available at the point of triage) versus manipulated test data (updated to the more complete problem list available at the end of the encounter). These differences were compared between Black and non-Hispanic White patient groups using multiple performance measures relevant to health equity.

There were modest but significant changes in predictive performance when comparing the observed to the manipulated data across both Black and non-Hispanic White patient groups; c-statistic improvements ranged from 0.027 to 0.058. The manipulation produced no between-group difference in c-statistic by race. However, there were small between-group differences in other performance measures, with greater change for non-Hispanic White patients. Problem list missingness impacted model performance for both patient groups, with marginal differences detected by race. Further exploration is needed to examine how missingness may contribute to racial disparities in clinical model predictions across settings. The novel manipulation method demonstrated here may aid future research.
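As a rough illustration of the evaluation design described above, the sketch below computes group-wise c-statistics (ROC AUC) for an already-fitted binary classifier on two versions of the same test set, one with problem-list features as recorded at triage ("observed") and one updated to the end-of-encounter problem list ("manipulated"), and reports the per-group change. This is a minimal sketch, not the authors' code: the variable names, the `race` grouping column, and the scikit-learn `predict_proba` interface are assumptions made for illustration.

```python
# Hypothetical sketch: per-group c-statistic comparison on observed vs. manipulated test data.
# Assumes a fitted scikit-learn-style classifier and two feature matrices with identical rows.
import pandas as pd
from sklearn.metrics import roc_auc_score


def groupwise_c_statistics(model, X_observed, X_manipulated, y, race):
    """Return per-group c-statistics on both versions of the test data.

    model          -- fitted classifier exposing predict_proba
    X_observed     -- features built from the triage-time problem list
    X_manipulated  -- same rows, problem list updated to end of encounter
    y              -- binary adverse-event outcome (Series or array)
    race           -- pandas Series of group labels, aligned row-wise with X and y
    """
    rows = []
    for group in race.unique():
        mask = (race == group).to_numpy()  # boolean row selector for this group
        c_obs = roc_auc_score(y[mask], model.predict_proba(X_observed[mask])[:, 1])
        c_man = roc_auc_score(y[mask], model.predict_proba(X_manipulated[mask])[:, 1])
        rows.append({
            "group": group,
            "c_observed": c_obs,
            "c_manipulated": c_man,
            "delta": c_man - c_obs,  # change attributable to the more complete problem list
        })
    return pd.DataFrame(rows)
```

Under this framing, the between-group effect of missingness could be summarized as the difference in `delta` between Black and non-Hispanic White patients, with uncertainty estimated by bootstrapping the test set; the study also examines additional equity-relevant performance measures beyond the c-statistic.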