Fnu Rubaiya, Janet O'Connor, Lyubomir N Kolev, James M Antill, Margaret Iiams, LaNaya A Martin, Chantaezia Z Joseph, Claire Youngblood, Jennifer Almeda-Garrett, Linda E Kelemen
Title: Improving racial data equity among minority groups in South Carolina using COVID-19 as an example: application of principal components analysis.
Journal: Population Health Metrics 23(1):54
Publication date: 2025-10-09
Publication type: Journal Article
DOI: https://doi.org/10.1186/s12963-025-00419-4
Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12512368/pdf/
Citations: 0
Abstract
Background: Data inequity occurs when racial and ethnic groups are aggregated during data collection or reporting despite their differences. To demonstrate the importance of racial data equity, we re-analyzed South Carolina's (SC) census data and COVID-19 case-rate and death-rate distributions according to age, sex, and new combined single and multiracial categories.
Methods: The new combined single and multiracial categories grouped individuals who identified as a single race alone (such as American Indian or Alaska Native, AI-AN) together with those who identified as more than one race (such as AI-AN and White), regardless of Hispanic or Latino heritage. We compared those distributions to the single race categories using the American Community Survey 2018-2022 and COVID-19 case and death surveillance data, 2020-2023, for SC. We used principal components analysis to test for differences in age-sex distributions between single race alone and new combined single and multiracial categories for each race.
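The difference between the two categorizations can be illustrated with a minimal sketch. It is not the authors' code: the records, field names, and helper functions below are hypothetical, and it assumes that a multiracial person is counted once in each race they report, which the abstract does not spell out.

```python
from collections import Counter

# Hypothetical toy records (not study data): each person lists every race reported.
records = [
    {"races": {"AI-AN"}},
    {"races": {"AI-AN", "White"}},
    {"races": {"White"}},
    {"races": {"Asian", "White"}},
    {"races": {"Asian"}},
]


def single_race_alone_counts(people):
    """Count only people who report exactly one race."""
    counts = Counter()
    for person in people:
        if len(person["races"]) == 1:
            counts[next(iter(person["races"]))] += 1
    return counts


def combined_counts(people):
    """Count each person in every race they report, alone or in combination.

    Assumption: categories overlap, so totals can exceed the number of people.
    """
    counts = Counter()
    for person in people:
        for race in person["races"]:
            counts[race] += 1
    return counts


alone = single_race_alone_counts(records)
combined = combined_counts(records)
print("single race alone:", dict(alone))
print("combined single and multiracial:", dict(combined))
```

Under this assumption the combined categories are not mutually exclusive, so their counts should not be summed to obtain a population total.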
Results: Compared to the combined single and multiracial categories, single race alone categories lose information, underestimate the population of younger-aged people of AI-AN, Asian, and Native Hawaiian or Other Pacific Islander (NH-OPI) races, and result in COVID-19 case and death rates with extreme values across age groups, particularly for AI-AN and NH-OPI populations. Among AI-AN, certain age groups had different COVID-19 case rate patterns between females and males, but this was explained by race categorization (single race alone vs. combined single and multiracial, P < 0.0001).
Conclusions: Combined single and multiracial categories achieve data equity by avoiding data suppression or aggregation of small diverse populations. Differences in COVID-19 case rates across some age groups between females and males may be biased depending on how race is defined. Younger generations are increasingly multiracial and will be underrepresented if only single race categories are used in public health reporting practices.
About the journal:
Population Health Metrics aims to advance the science of population health assessment, and welcomes papers relating to concepts, methods, ethics, applications, and summary measures of population health. The journal provides a unique platform for population health researchers to share their findings with the global community. We seek research that addresses the communication of population health measures and policy implications to stakeholders; this includes papers related to burden estimation and risk assessment, and research addressing population health across the full range of development. Population Health Metrics covers a broad range of topics encompassing health state measurement and valuation, summary measures of population health, descriptive epidemiology at the population level, burden of disease and injury analysis, disease and risk factor modeling for populations, and comparative assessment of risks to health at the population level. The journal is also interested in how to use and communicate indicators of population health to reduce disease burden, and the approaches for translating from indicators of population health to health-advancing actions. As a cross-cutting topic of importance, we are particularly interested in inequalities in population health and their measurement.