Junming Shi, Alan E Hubbard, Nicholas Fong, Romain Pirracchio
Implicit bias in ICU electronic health record data: measurement frequencies and missing data rates of clinical variables.
BMC Medical Informatics and Decision Making, 25(1):241. Published 2025-07-01. DOI: https://doi.org/10.1186/s12911-025-03058-9. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12220764/pdf/
Abstract
Background: Systematic disparities in data collection within electronic health records (EHRs), defined as non-random patterns in the measurement and recording of clinical variables across demographic groups, may reflect underlying implicit bias and may affect patient outcomes. Identifying and mitigating these biases is critical for ensuring equitable healthcare. This study aims to develop an analytical framework for measurement patterns, defined as the combination of measurement frequency (how often variables are collected) and missing data rates (the frequency of missing recordings); to evaluate the association between these patterns and demographic factors; and to assess their impact on in-hospital mortality prediction.
Methods: We conducted a retrospective cohort study using the Medical Information Mart for Intensive Care III (MIMIC-III) database, which includes data on over 40,000 ICU patients from Beth Israel Deaconess Medical Center (2001-2012). Adult patients with ICU stays longer than 24 h were included. Measurement patterns, including missing data rates and measurement frequencies, were derived from EHR data and analyzed. Targeted Machine Learning (TML) methods were used to assess potential systematic disparities in measurement patterns across demographic factors (age, gender, race/ethnicity) while controlling for confounders such as other demographics and disease severity. The predictive power of measurement patterns on in-hospital mortality was evaluated.
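As an illustration of how the two components of a measurement pattern might be derived from charted events, the following is a minimal sketch and not the authors' pipeline. The event layout, the function name `measurement_patterns`, and in particular the operational definition of the missing data rate (the share of hourly bins in the first 24 h with no recorded value) are assumptions made for this example; the abstract does not specify them.

```python
from collections import defaultdict

WINDOW_HOURS = 24  # first 24 h of the ICU stay, as in the abstract


def measurement_patterns(events, variables, window_hours=WINDOW_HOURS):
    """Summarise per-patient measurement patterns (hypothetical helper).

    events: iterable of (patient_id, variable, hours_since_icu_admission, value);
            value is None when nothing was recorded for that entry.
    Returns {patient_id: {variable: {"frequency_per_hour": f, "missing_rate": m}}}.
    """
    counts = defaultdict(int)          # (patient, variable) -> recorded values in window
    recorded_hours = defaultdict(set)  # (patient, variable) -> hourly bins with a value
    patients = set()

    for patient_id, variable, hour, value in events:
        patients.add(patient_id)
        # Skip empty recordings and anything outside the analysis window.
        if value is None or not (0 <= hour < window_hours):
            continue
        counts[(patient_id, variable)] += 1
        recorded_hours[(patient_id, variable)].add(int(hour))

    summary = {}
    for patient_id in patients:
        summary[patient_id] = {}
        for variable in variables:
            key = (patient_id, variable)
            summary[patient_id][variable] = {
                # Assumed definition: recorded values per hour of the window.
                "frequency_per_hour": counts[key] / window_hours,
                # Assumed definition: share of hourly bins with no value.
                "missing_rate": 1 - len(recorded_hours[key]) / window_hours,
            }
    return summary


# Toy events, invented for illustration only (not MIMIC-III data).
events = [
    ("p1", "temperature", 0.5, 36.8),
    ("p1", "temperature", 0.9, 37.0),   # same hourly bin as the row above
    ("p1", "temperature", 4.0, 37.1),
    ("p1", "spo2", 1.0, None),          # entry with no recorded value
    ("p2", "temperature", 30.0, 36.5),  # outside the first 24 h
]
patterns = measurement_patterns(events, ["temperature", "spo2"])
```

In this sketch, frequency and missingness carry different information: two readings in the same hour raise the frequency but fill only one hourly bin. Comparing such summaries across demographic groups while adjusting for confounders, as the study does with TML, is a separate step not shown here.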
Results: Among 23,426 patients, significant systematic disparities across demographic groups were observed in the first 24 h of ICU stays. Elderly patients (≥ 65 years) had more frequent temperature measurements than younger patients, while males had slightly fewer missing temperature measurements than females. Racial disparities were notable: White patients had more frequent blood pressure and oxygen saturation (SpO2) measurements than Black and Hispanic patients. Measurement patterns were associated with ICU mortality, with models based solely on these patterns achieving an area under the receiver operating characteristic curve (AUC) of 0.76 (95% CI: 0.74-0.77).
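For readers unfamiliar with the reported metric, the sketch below shows one common way to compute an AUC and a percentile bootstrap confidence interval from predicted risks. It is a generic illustration: the abstract does not state which model was used or how the interval was obtained, and the function names, the bootstrap settings, and the toy labels and scores are all assumptions.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (assumed procedure)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        resampled_labels = [labels[i] for i in idx]
        if len(set(resampled_labels)) < 2:
            continue  # resample lacked one class; draw again
        stats.append(auc(resampled_labels, [scores[i] for i in idx]))
    stats.sort()
    k = int(alpha / 2 * n_boot)  # number of resamples trimmed from each tail
    return stats[k], stats[n_boot - 1 - k]


# Toy data, invented for illustration: 1 = died, 0 = survived.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
point_estimate = auc(labels, scores)  # 3 of 4 positive/negative pairs are ordered correctly
ci_low, ci_high = bootstrap_ci(labels, scores, n_boot=200)
```

The pairwise form of the AUC is quadratic in the sample size; for a cohort of the size reported here, a rank-based implementation would be the practical choice.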
Conclusions: This study underscores the significance of measurement patterns in ICU EHR data, which are associated with patient demographics and ICU mortality. Analyzing patterns of missing data and measurement frequencies provides valuable insights into patient monitoring practices and potential systemic disparities in healthcare delivery. Understanding these disparities is critical for improving the fairness of healthcare delivery and developing more accurate predictive models in critical care settings.
About the journal:
BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.