Reporting of Fairness Metrics in Clinical Risk Prediction Models: A Call for Change to Ensure Equitable Precision Health Benefits for All

Lillian Rountree, Yi-Ting Lin, Chuyu Liu, Maxwell Salvatore, Andrew Admon, Brahmajee K Nallamothu, Karandeep Singh, Anirban Basu, Fan Bu, Bhramar Mukherjee

Online Journal of Public Health Informatics, published February 17, 2025. DOI: 10.2196/66598
Abstract
Background: Clinical risk prediction models integrated into digitized healthcare informatics systems hold promise for personalized primary prevention and care, a core goal of precision health. Fairness metrics are important tools for evaluating potential disparities across sensitive features, such as sex and race/ethnicity, in the field of prediction modeling. However, fairness metric usage in clinical risk prediction models remains infrequent, sporadic, and rarely empirically evaluated.
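To make the idea of a fairness metric concrete, the sketch below (not drawn from any of the reviewed articles) computes one commonly used metric, the equal opportunity difference, i.e., the gap in sensitivity across levels of a sensitive feature, for a hypothetical binary clinical risk model. All data, thresholds, and group labels are illustrative placeholders.

```python
# Minimal illustrative sketch: equal opportunity difference for a binary risk model.
# All inputs below are synthetic; real use would substitute model predictions,
# observed outcomes, and a recorded sensitive feature such as sex or race/ethnicity.
import numpy as np

def true_positive_rate(y_true, y_pred):
    """Sensitivity: proportion of truly positive cases flagged as high risk."""
    positives = y_true == 1
    return np.mean(y_pred[positives]) if positives.any() else np.nan

def equal_opportunity_difference(y_true, y_pred, group):
    """Largest pairwise gap in sensitivity across levels of the sensitive feature."""
    tprs = [true_positive_rate(y_true[group == g], y_pred[group == g])
            for g in np.unique(group)]
    return np.nanmax(tprs) - np.nanmin(tprs)

# Toy example: continuous risk scores thresholded at 0.5, compared across sex.
rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1000)
risk_score = np.clip(y_true * 0.3 + rng.normal(0.4, 0.2, size=1000), 0, 1)
sex = rng.choice(["female", "male"], size=1000)
y_pred = (risk_score >= 0.5).astype(int)

print(equal_opportunity_difference(y_true, y_pred, sex))  # values near 0 indicate parity
```

Analogous group-wise comparisons of calibration, positive predictive value, or false positive rate are other common choices; the point is that each requires outcomes, predictions, and the sensitive feature to be recorded and reported.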
Objective: We seek to assess the uptake of fairness metrics in clinical risk prediction modeling through an empirical evaluation of popular prediction models for two diseases: one chronic and one infectious.
Methods: We conducted a scoping literature review in November 2023 of recent high-impact publications on clinical risk prediction models for cardiovascular disease (CVD) and COVID-19 using Google Scholar.
Results: Our review resulted in a shortlist of 23 CVD-focused articles and 22 COVID-19-focused articles. No articles evaluated fairness metrics. Of the CVD articles, 26% used a sex-stratified model, and of those with race/ethnicity data, 92% had study populations that were more than 50% from a single race/ethnicity. Of the COVID-19 models, 9% used a sex-stratified model, and of those that included race/ethnicity data, 50% had study populations that were more than 50% from a single race/ethnicity. No articles for either disease stratified their models by race/ethnicity.
Conclusions: Our review shows that the use of fairness metrics for evaluating differences across sensitive features is rare, despite their ability to identify inequality and flag potential gaps in prevention and care. We also find that training data remain largely racially/ethnically homogeneous, demonstrating an urgent need to diversify study cohorts and data collection. We propose an implementation framework to initiate change, calling for better connections between theory and practice in the adoption of fairness metrics for clinical risk prediction. We hypothesize that this integration will lead to a more equitable prediction landscape.