Predicting postoperative chronic opioid use with fair machine learning models integrating multi-modal data sources: a demonstration of ethical machine learning in healthcare.
Impact Factor 4.7 · CAS Zone 2 (Medicine) · Q1 in COMPUTER SCIENCE, INFORMATION SYSTEMS
Nidhi Soley, Ilia Rattsev, Traci J Speed, Anping Xie, Kadija S Ferryman, Casey Overby Taylor
Journal of the American Medical Informatics Association (JAMIA)
DOI: 10.1093/jamia/ocaf053
Published: 2025-03-27 (Journal Article)
Citations: 0
Abstract
Objective: Building upon our previous work on predicting chronic opioid use using electronic health records (EHR) and wearable data, this study leveraged the Health Equity Across the AI Lifecycle (HEAAL) framework to (a) fine-tune the previously built model with genomic data and evaluate model performance in predicting chronic opioid use and (b) apply IBM's AIF360 preprocessing toolkit to mitigate bias related to gender and race and evaluate model performance using various fairness metrics.
Materials and methods: Participants included approximately 271 All of Us Research Program subjects with EHR, wearable, and genomic data. We fine-tuned 4 machine learning models on the new dataset. The SHapley Additive exPlanations (SHAP) technique identified the most influential predictors. A preprocessing toolkit was applied to improve fairness across gender and race.
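The abstract does not list the fairness metrics used. Two metrics commonly reported with AIF360-style evaluations, statistical parity difference and disparate impact, can be sketched in a few lines of numpy; the toy predictions and group labels below are illustrative, not the study's data:

```python
import numpy as np

def statistical_parity_difference(y_pred, group):
    """P(pred=1 | unprivileged) - P(pred=1 | privileged).
    group: 1 = privileged, 0 = unprivileged. 0 indicates parity."""
    return y_pred[group == 0].mean() - y_pred[group == 1].mean()

def disparate_impact(y_pred, group):
    """Ratio of selection rates (unprivileged / privileged); 1.0 indicates parity."""
    return y_pred[group == 0].mean() / y_pred[group == 1].mean()

# toy binary predictions for 8 subjects: first 4 privileged, last 4 unprivileged
y_pred = np.array([1, 1, 0, 1, 1, 0, 0, 0])
group  = np.array([1, 1, 1, 1, 0, 0, 0, 0])

print(statistical_parity_difference(y_pred, group))  # 0.25 - 0.75 = -0.5
print(disparate_impact(y_pred, group))               # 0.25 / 0.75 = 1/3
```

Values near 0 (for the difference) and near 1 (for the ratio) indicate similar treatment of the two groups; the thresholds used to call a model "fair" vary by application.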
Results: The genetic data enhanced model performance from the prior model, with the area under the curve improving from 0.90 (95% CI, 0.88-0.92) to 0.95 (95% CI, 0.89-0.95). Key predictors included Dopamine D1 Receptor (DRD1) rs4532, general type of surgery, and time spent in physical activity. The reweighing preprocessing technique applied to the stacking algorithm effectively improved the model's fairness across racial and gender groups without compromising performance.
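The reweighing preprocessing step mentioned above (implemented in AIF360 as `Reweighing`) weights each (group, label) cell by the ratio of its expected frequency under group/label independence to its observed frequency, so that the weighted data shows equal base rates across groups. A minimal numpy sketch of that scheme, with a toy group/label assignment rather than the study's data:

```python
import numpy as np

def reweighing_weights(group, label):
    """Kamiran-Calders reweighing: weight each (group, label) cell by
    expected count under independence / observed count."""
    n = len(label)
    w = np.empty(n, dtype=float)
    for g in np.unique(group):
        for l in np.unique(label):
            mask = (group == g) & (label == l)
            if mask.any():
                w[mask] = ((group == g).sum() * (label == l).sum()) / (n * mask.sum())
    return w

# toy data: group 1 has a higher positive rate (2/3) than group 0 (1/5)
group = np.array([1, 1, 1, 0, 0, 0, 0, 0])
label = np.array([1, 1, 0, 1, 0, 0, 0, 0])
w = reweighing_weights(group, label)

# weighted positive rate per group is equalized at the overall base rate (3/8)
for g in (0, 1):
    rate = w[(group == g) & (label == 1)].sum() / w[group == g].sum()
    print(g, rate)
```

These instance weights would then be passed to the classifier's training routine (e.g., a `sample_weight` argument), which is how a preprocessing mitigator improves fairness without altering the model itself.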
Conclusion: We leveraged 2 dimensions of the HEAAL framework to build a fair artificial intelligence (AI) solution. Multi-modal datasets (including wearable and genetic data) and applying bias mitigation strategies can help models to more fairly and accurately assess risk across diverse populations, promoting fairness in AI in healthcare.
About the journal:
JAMIA is AMIA's premier peer-reviewed journal for biomedical and health informatics. Covering the full spectrum of activities in the field, JAMIA includes informatics articles in the areas of clinical care, clinical research, translational science, implementation science, imaging, education, consumer health, public health, and policy. JAMIA's articles describe innovative informatics research and systems that help to advance biomedical science and to promote health. Case reports, perspectives, and reviews also help readers stay connected with the most important informatics developments in implementation, policy, and education.