Evaluating Algorithmic Approaches to Uncover Racial, Ethnic, and Gender Disparities in Scientific Authorship
Authors: Yimeng Song, Nabarun Dasgupta, Michelle L Bell
Journal: American Journal of Public Health (published 2025-05-08, pages e1-e8)
DOI: 10.2105/ajph.2025.308017
Abstract
To explore the capabilities of race/ethnicity and gender prediction algorithms in uncovering patterns of authorship distribution in scientific paper submissions to a major peer-reviewed scientific journal (AJPH), we analyzed 17,667 manuscript submissions from the United States between 2013 and 2022. We used machine-learning algorithms to predict corresponding authors' race/ethnicity (Asian, Black, Hispanic, White) and gender categories based on name-derived probabilities to compare the predictive performance of these algorithms and their impact on disparity analysis. Predicted White authors dominated submissions and had the highest acceptance rates (21.1%), while predicted Asian authors had the lowest (14.9%). Predicted women, despite being the majority, had lower acceptance rates (17.9%) than predicted men (20.5%), a trend consistent across most racial/ethnic groups. Different algorithms revealed similar disparities but were limited by biases and inaccuracies in predicting race and ethnicity. Manuscript acceptance rates revealed disparities by race/ethnicity and gender; predicted White and male authors had the highest rates. While machine-learning algorithms can identify such patterns, their limitations necessitate combining them with self-identified demographic data for greater accuracy. (Am J Public Health. Published online ahead of print May 8, 2025:e1-e8. https://doi.org/10.2105/AJPH.2025.308017).
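As a rough illustration of the kind of analysis the abstract describes, the sketch below assigns each submission to the category with the highest name-derived probability and then computes acceptance rates per predicted group. It is not the authors' pipeline: the records, the probability values, and the function names (`predicted_group`, `acceptance_rates`, `weighted_acceptance_rates`) are all hypothetical, and the probability-weighted variant is an assumed alternative included only for comparison.

```python
from collections import defaultdict

# Hypothetical records: name-derived category probabilities plus the
# editorial outcome. All values are made up for illustration.
submissions = [
    {"probs": {"Asian": 0.85, "Black": 0.03, "Hispanic": 0.02, "White": 0.10}, "accepted": False},
    {"probs": {"Asian": 0.02, "Black": 0.05, "Hispanic": 0.03, "White": 0.90}, "accepted": True},
    {"probs": {"Asian": 0.05, "Black": 0.05, "Hispanic": 0.30, "White": 0.60}, "accepted": False},
    {"probs": {"Asian": 0.05, "Black": 0.05, "Hispanic": 0.70, "White": 0.20}, "accepted": True},
]


def predicted_group(probs):
    """Return the category with the highest name-derived probability."""
    return max(probs, key=probs.get)


def acceptance_rates(records):
    """Acceptance rate per group, using a hard (argmax) assignment."""
    totals = defaultdict(int)
    accepted = defaultdict(int)
    for rec in records:
        group = predicted_group(rec["probs"])
        totals[group] += 1
        accepted[group] += int(rec["accepted"])
    return {g: accepted[g] / totals[g] for g in totals}


def weighted_acceptance_rates(records):
    """Assumed alternative: each submission counts fractionally toward
    every group, in proportion to its predicted probability."""
    totals = defaultdict(float)
    accepted = defaultdict(float)
    for rec in records:
        for group, p in rec["probs"].items():
            totals[group] += p
            if rec["accepted"]:
                accepted[group] += p
    return {g: accepted[g] / totals[g] for g in totals if totals[g] > 0}


hard_rates = acceptance_rates(submissions)
soft_rates = weighted_acceptance_rates(submissions)
```

The two functions can disagree when probabilities are spread across categories, which is one way prediction uncertainty can change the size of an estimated disparity.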
About the journal:
The American Journal of Public Health (AJPH) is dedicated to publishing original work in research, research methods, and program evaluation within the field of public health. The journal's mission is to advance public health research, policy, practice, and education.