Emily F Wong, Anil K Saini, Eynav E Accortt, Melissa S Wong, Jason H Moore, Tiffani J Bright
{"title":"Evaluating Bias-Mitigated Predictive Models of Perinatal Mood and Anxiety Disorders.","authors":"Emily F Wong, Anil K Saini, Eynav E Accortt, Melissa S Wong, Jason H Moore, Tiffani J Bright","doi":"10.1001/jamanetworkopen.2024.38152","DOIUrl":null,"url":null,"abstract":"<p><strong>Importance: </strong>Machine learning for augmented screening of perinatal mood and anxiety disorders (PMADs) requires thorough consideration of clinical biases embedded in electronic health records (EHRs) and rigorous evaluations of model performance.</p><p><strong>Objective: </strong>To mitigate bias in predictive models of PMADs trained on commonly available EHRs.</p><p><strong>Design, setting, and participants: </strong>This diagnostic study collected data as part of a quality improvement initiative from 2020 to 2023 at Cedars-Sinai Medical Center in Los Angeles, California. The study inclusion criteria were birthing patients aged 14 to 59 years with live birth records and admission to the postpartum unit or the maternal-fetal care unit after delivery.</p><p><strong>Exposure: </strong>Patient-reported race and ethnicity (7 levels) obtained through EHRs.</p><p><strong>Main outcomes and measures: </strong>Logistic regression, random forest, and extreme gradient boosting models were trained to predict 2 binary outcomes: moderate to high-risk (positive) screen assessed using the 9-item Patient Health Questionnaire (PHQ-9), and the Edinburgh Postnatal Depression Scale (EPDS). Each model was fitted with or without reweighing data during preprocessing and evaluated through repeated K-fold cross validation. In every iteration, each model was evaluated on its area under the receiver operating curve (AUROC) and on 2 fairness metrics: demographic parity (DP), and difference in false negatives between races and ethnicities (relative to non-Hispanic White patients).</p><p><strong>Results: </strong>Among 19 430 patients in this study, 1402 (7%) identified as African American or Black, 2371 (12%) as Asian American and Pacific Islander; 1842 (10%) as Hispanic White, 10 942 (56.3%) as non-Hispanic White, 606 (3%) as multiple races, 2146 (11%) as other (not further specified), and 121 (<1%) did not provide this information. The mean (SD) age was 34.1 (4.9) years, and all patients identified as female. Racial and ethnic minority patients were significantly more likely than non-Hispanic White patients to screen positive on both the PHQ-9 (odds ratio, 1.47 [95% CI, 1.23-1.77]) and the EPDS (odds ratio, 1.38 [95% CI, 1.20-1.57]). Mean AUROCs ranged from 0.610 to 0.635 without reweighing (baseline), and from 0.602 to 0.622 with reweighing. Baseline models predicted significantly greater prevalence of postpartum depression for patients who were not non-Hispanic White relative to those who were (mean DP, 0.238 [95% CI, 0.231-0.244]; P < .001) and displayed significantly lower false-negative rates (mean difference, -0.184 [95% CI, -0.195 to -0.174]; P < .001). Reweighing significantly reduced differences in DP (mean DP with reweighing, 0.022 [95% CI, 0.017-0.026]; P < .001) and false-negative rates (mean difference with reweighing, 0.018 [95% CI, 0.008-0.028]; P < .001) between racial and ethnic groups.</p><p><strong>Conclusions and relevance: </strong>In this diagnostic study of predictive models of postpartum depression, clinical prediction models trained to predict psychometric screening results from commonly available EHRs achieved modest performance and were less likely to widen existing health disparities in PMAD diagnosis and potentially treatment. These findings suggest that is critical for researchers and physicians to consider their model design (eg, desired target and predictor variables) and evaluate model bias to minimize health disparities.</p>","PeriodicalId":14694,"journal":{"name":"JAMA Network Open","volume":"7 12","pages":"e2438152"},"PeriodicalIF":10.5000,"publicationDate":"2024-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11615713/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"JAMA Network Open","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1001/jamanetworkopen.2024.38152","RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"MEDICINE, GENERAL & INTERNAL","Score":null,"Total":0}
引用次数: 0
Abstract
Importance: Machine learning for augmented screening of perinatal mood and anxiety disorders (PMADs) requires thorough consideration of clinical biases embedded in electronic health records (EHRs) and rigorous evaluations of model performance.
Objective: To mitigate bias in predictive models of PMADs trained on commonly available EHRs.
Design, setting, and participants: This diagnostic study collected data as part of a quality improvement initiative from 2020 to 2023 at Cedars-Sinai Medical Center in Los Angeles, California. The study inclusion criteria were birthing patients aged 14 to 59 years with live birth records and admission to the postpartum unit or the maternal-fetal care unit after delivery.
Exposure: Patient-reported race and ethnicity (7 levels) obtained through EHRs.
Main outcomes and measures: Logistic regression, random forest, and extreme gradient boosting models were trained to predict 2 binary outcomes: moderate to high-risk (positive) screen assessed using the 9-item Patient Health Questionnaire (PHQ-9), and the Edinburgh Postnatal Depression Scale (EPDS). Each model was fitted with or without reweighing data during preprocessing and evaluated through repeated K-fold cross validation. In every iteration, each model was evaluated on its area under the receiver operating curve (AUROC) and on 2 fairness metrics: demographic parity (DP), and difference in false negatives between races and ethnicities (relative to non-Hispanic White patients).
Results: Among 19 430 patients in this study, 1402 (7%) identified as African American or Black, 2371 (12%) as Asian American and Pacific Islander; 1842 (10%) as Hispanic White, 10 942 (56.3%) as non-Hispanic White, 606 (3%) as multiple races, 2146 (11%) as other (not further specified), and 121 (<1%) did not provide this information. The mean (SD) age was 34.1 (4.9) years, and all patients identified as female. Racial and ethnic minority patients were significantly more likely than non-Hispanic White patients to screen positive on both the PHQ-9 (odds ratio, 1.47 [95% CI, 1.23-1.77]) and the EPDS (odds ratio, 1.38 [95% CI, 1.20-1.57]). Mean AUROCs ranged from 0.610 to 0.635 without reweighing (baseline), and from 0.602 to 0.622 with reweighing. Baseline models predicted significantly greater prevalence of postpartum depression for patients who were not non-Hispanic White relative to those who were (mean DP, 0.238 [95% CI, 0.231-0.244]; P < .001) and displayed significantly lower false-negative rates (mean difference, -0.184 [95% CI, -0.195 to -0.174]; P < .001). Reweighing significantly reduced differences in DP (mean DP with reweighing, 0.022 [95% CI, 0.017-0.026]; P < .001) and false-negative rates (mean difference with reweighing, 0.018 [95% CI, 0.008-0.028]; P < .001) between racial and ethnic groups.
Conclusions and relevance: In this diagnostic study of predictive models of postpartum depression, clinical prediction models trained to predict psychometric screening results from commonly available EHRs achieved modest performance and were less likely to widen existing health disparities in PMAD diagnosis and potentially treatment. These findings suggest that is critical for researchers and physicians to consider their model design (eg, desired target and predictor variables) and evaluate model bias to minimize health disparities.
期刊介绍:
JAMA Network Open, a member of the esteemed JAMA Network, stands as an international, peer-reviewed, open-access general medical journal.The publication is dedicated to disseminating research across various health disciplines and countries, encompassing clinical care, innovation in health care, health policy, and global health.
JAMA Network Open caters to clinicians, investigators, and policymakers, providing a platform for valuable insights and advancements in the medical field. As part of the JAMA Network, a consortium of peer-reviewed general medical and specialty publications, JAMA Network Open contributes to the collective knowledge and understanding within the medical community.