Intersectional and Marginal Debiasing in Prediction Models for Emergency Admissions.

IF 10.5 | Region 1 (Medicine) | Q1 MEDICINE, GENERAL & INTERNAL
Elle Lett, Shakiba Shahbandegan, Yuval Barak-Corren, Andrew M Fine, William G La Cava
Citations: 0

Abstract

Intersectional and Marginal Debiasing in Prediction Models for Emergency Admissions.

Importance: Fair clinical prediction models are crucial for achieving equitable health outcomes. Intersectionality has been applied to develop algorithms that address discrimination among intersections of protected attributes (eg, Black women rather than Black persons or women separately), yet most fair algorithms default to marginal debiasing, optimizing performance across simplified patient subgroups.

Objective: To assess the extent to which simplifying patient subgroups during training is associated with intersectional subgroup performance in emergency department (ED) admission models.

Design, setting, and participants: This prognostic study of admission prediction models used retrospective data from ED visits to Beth Israel Deaconess Medical Center Medical Information Mart for Intensive Care IV (MIMIC-IV; n = 160 016) from January 1, 2011, to December 31, 2019, and Boston Children's Hospital (BCH; n = 22 222) from June 1 through August 13, 2019. Statistical analysis was conducted from January 2022 to August 2024.

Main outcomes and measures: The primary outcome was admission to an in-patient service. The accuracy of admission predictions among intersectional subgroups was measured under variations on model training with respect to optimizing for group level performance. Under different fairness definitions (calibration, error rate balance) and modeling methods (linear, nonlinear), overall performance and subgroup performance of marginal debiasing approaches were compared with intersectional debiasing approaches. Subgroups were defined by self-reported race and ethnicity and gender. Measures include area under the receiver operator characteristic curve (AUROC), area under the precision recall curve, subgroup calibration error, and false-negative rates.
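The comparison between marginal and intersectional debiasing hinges on how subgroups are defined when group-level metrics are computed. As an illustrative sketch (not the authors' code; all names and data below are hypothetical), the false-negative rate can be computed per intersectional subgroup by crossing race and ethnicity with gender, rather than evaluating each attribute separately:

```python
import numpy as np

def subgroup_fnr(y_true, y_pred, groups):
    """False-negative rate among positives within each subgroup."""
    rates = {}
    for g in np.unique(groups):
        positives = (groups == g) & (y_true == 1)
        if positives.sum() > 0:  # FNR is undefined for subgroups with no admissions
            rates[str(g)] = float((y_pred[positives] == 0).mean())
    return rates

# Toy data (hypothetical): intersectional subgroups cross race and
# ethnicity with gender instead of using each attribute marginally.
race = np.array(["Black", "Black", "White", "White"])
gender = np.array(["F", "M", "F", "M"])
intersectional = np.array([f"{r}_{g}" for r, g in zip(race, gender)])

y_true = np.array([1, 1, 1, 0])  # 1 = admitted to an inpatient service
y_pred = np.array([0, 1, 1, 0])  # model's predicted admission

print(subgroup_fnr(y_true, y_pred, intersectional))
# {'Black_F': 1.0, 'Black_M': 0.0, 'White_F': 0.0}
```

A marginal analysis would call the same helper with `race` or `gender` alone; a disparity visible in `Black_F` can then be averaged away inside the larger `Black` or `F` marginal groups, which is the failure mode intersectional debiasing targets.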

Results: The MIMIC-IV cohort included 160 016 visits (mean [SD] age, 53.0 [19.3] years; 57.4% female patients; 0.3% American Indian or Alaska Native patients, 3.7% Asian patients, 26.2% Black patients, 10.0% Hispanic or Latino patients, and 59.7% White patients; 29.5% admitted) and the BCH cohort included 22 222 visits (mean [SD] age, 8.2 [6.8] years; 52.1% male patients; 0.1% American Indian or Alaska Native patients, 4.0% Asian patients, 19.7% Black patients, 30.6% Hispanic or Latino patients, 0.2% Native Hawaiian or Pacific Islander patients, 37.7% White patients; 16.3% admitted). Among MIMIC-IV groups, intersectional debiasing was associated with a reduced subgroup calibration error from 0.083 to 0.065 (22.3%), while marginal fairness debiasing was associated with a reduced subgroup calibration error from 0.083 to 0.074 (11.3%; difference, 11.1%); among BCH groups, intersectional debiasing was associated with a reduced subgroup calibration error from 0.111 to 0.080 (28.3%), while marginal fairness debiasing was associated with a reduced subgroup calibration error from 0.111 to 0.086 (22.6%; difference, 5.7%). Among MIMIC-IV groups, intersectional debiasing was associated with lowered subgroup false-negative rates from 0.142 to 0.125 (11.9%), while marginal debiasing was associated with lowered subgroup false-negative rates from 0.142 to 0.132 (6.8%; difference, 5.1%). Fairness improvements did not decrease overall accuracy compared with baseline models (eg, MIMIC-IV: mean [SD] AUROC, 0.85 [0.00], both models). Intersectional debiasing was associated with lowered error rates in several intersectional subpopulations compared with other strategies.
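The percentage changes reported above are relative reductions in each metric. A quick sanity check from the rounded values printed in the abstract (the paper's exact percentages presumably come from unrounded underlying values):

```python
def relative_reduction(before, after):
    """Percent reduction from `before` to `after`."""
    return (before - after) / before * 100

# MIMIC-IV subgroup calibration error under intersectional debiasing:
print(round(relative_reduction(0.083, 0.065), 1))  # 21.7
# The abstract reports 22.3%, consistent with unrounded underlying values.
```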

Conclusions and relevance: This study suggests that intersectional debiasing better mitigates performance disparities across intersecting groups than marginal debiasing for admission prediction. Intersectionally debiased models were associated with reduced group-specific errors without compromising overall accuracy. Clinical risk prediction models should consider incorporating intersectional debiasing into their development.

Source journal: JAMA Network Open (Medicine, General Medicine)
CiteScore: 16.00
Self-citation rate: 2.90%
Annual publications: 2126
Review time: 16 weeks
Journal introduction: JAMA Network Open, a member of the JAMA Network, is an international, peer-reviewed, open-access general medical journal. It publishes research across health disciplines and countries, encompassing clinical care, innovation in health care, health policy, and global health. The journal serves clinicians, investigators, and policymakers. As part of the JAMA Network, a consortium of peer-reviewed general medical and specialty publications, it contributes to the collective knowledge and understanding within the medical community.