Potential to perpetuate social biases in health care by Chinese large language models: a model evaluation study
Chenxi Liu, Jianing Zheng, Yushu Liu, Xi Wang, Yuting Zhang, Qiang Fu, Wenwen Yu, Ting Yu, Wang Jiang, Dan Wang, Chaojie Liu
International Journal for Equity in Health, 24(1), 206. Published 2025-07-15. DOI: 10.1186/s12939-025-02581-5 (https://doi.org/10.1186/s12939-025-02581-5)
Abstract
Background: Large language models (LLMs) may perpetuate or amplify social biases toward patients. We systematically assessed potential biases of three popular Chinese LLMs in clinical application scenarios.
Methods: We tested whether Qwen, Ernie, and Baichuan encode social biases toward patients of different sex, ethnicity, educational attainment, income level, and health insurance status. First, we prompted the LLMs to generate clinical cases for medical education (n = 8,289) and compared the distribution of patient characteristics in the LLM-generated cases with national distributions in China. Second, we used New England Journal of Medicine Healer clinical vignettes to prompt the LLMs to generate differential diagnoses and treatment plans (n = 45,600), and analyzed how the outputs varied with sociodemographic characteristics. Third, we prompted the LLMs to assess patient needs (n = 51,039) based on clinical cases, to reveal any implicit biases toward patients with different characteristics.
Results: To varying degrees, the three LLMs showed social biases toward patients with different characteristics in medical education, diagnostic and treatment recommendations, and patient needs assessment. These biases appeared more often in relation to sex, ethnicity, income level, and health insurance status than in relation to educational attainment. Overall, the three LLMs failed to appropriately model the sociodemographic diversity of medical conditions, consistently over-representing male, high-education, and high-income populations. They also showed a higher referral rate, indicating potential refusal to treat patients, for minority ethnic groups and for patients without insurance or living on low incomes. The three LLMs were more likely to recommend pain medications for males, and rated patients with higher educational attainment, Han ethnicity, higher income, or health insurance as having healthier relationships with others.
Interpretation: Our findings broaden the scope of potential biases inherent in LLMs and highlight the urgent need for systematic and continuous assessment of social biases in LLMs in real-world clinical applications.
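The first step in the Methods, comparing the distribution of patient characteristics in LLM-generated cases with national distributions, can be illustrated with a small sketch. The abstract does not say which statistical test the authors used; a chi-square goodness-of-fit test is one common choice and is assumed here. The function names, the counts, and the reference shares below are all hypothetical and are not taken from the study.

```python
import math


def chi_square_gof(observed, reference_share):
    """Chi-square goodness-of-fit statistic.

    observed: dict mapping group -> count in the generated cases.
    reference_share: dict mapping group -> population proportion (sums to 1).
    """
    total = sum(observed.values())
    stat = 0.0
    for group, share in reference_share.items():
        expected = total * share  # count expected if cases matched the population
        stat += (observed.get(group, 0) - expected) ** 2 / expected
    return stat


def p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    Valid only for two-category comparisons (e.g. male vs. female).
    """
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical example: 1,000 generated cases, 650 of them male,
# compared against an assumed 50/50 reference split (not real census data).
observed = {"male": 650, "female": 350}
reference = {"male": 0.5, "female": 0.5}

stat = chi_square_gof(observed, reference)
p = p_value_df1(stat)
print(f"chi-square = {stat:.1f}, p = {p:.3g}")
```

A large statistic with a small p-value would suggest that the generated cases over- or under-represent a group relative to the reference population. Attributes with more than two categories would need the chi-square distribution with the matching degrees of freedom, for instance from a statistics library.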
About the journal:
International Journal for Equity in Health is an Open Access, peer-reviewed, online journal presenting evidence relevant to the search for, and attainment of, equity in health across and within countries. International Journal for Equity in Health aims to improve the understanding of issues that influence the health of populations. This includes the discussion of political, policy-related, economic, social and health services-related influences, particularly with regard to systematic differences in distributions of one or more aspects of health in population groups defined demographically, geographically, or socially.