Comparing diversity, negativity, and stereotypes in Chinese-language AI technologies: an investigation of Baidu, Ernie and Qwen
Authors: Geng Liu, Carlo Alberto Bono, Francesco Pierri
Journal: PeerJ Computer Science, volume 11, e2694 (Journal Article, eCollection)
Published: 2025-03-07
DOI: https://doi.org/10.7717/peerj-cs.2694
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11935762/pdf/
Impact factor: 3.5 (JCR Q2, Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Large language models (LLMs) and search engines have the potential to perpetuate biases and stereotypes by amplifying existing prejudices in their training data and algorithmic processes, thereby influencing public perception and decision-making. While most work has focused on Western-centric AI technologies, we examine social biases embedded in prominent Chinese commercial tools: the main search engine, Baidu, and two leading LLMs, Ernie and Qwen. Leveraging a dataset of 240 social groups across 13 categories describing Chinese society, we collect over 30k views encoded in these tools by prompting them to generate candidate words describing the groups. We find that the language models exhibit a broader range of embedded views than the search engine, although Baidu and Qwen generate negative content more often than Ernie. We also observe a moderate prevalence of stereotypes embedded in the language models, many of which potentially promote offensive or derogatory views. Our work highlights the importance of prioritizing fairness and inclusivity in AI technologies from a global perspective.
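The abstract does not spell out how diversity and negativity are measured. As a rough illustration only, the sketch below assumes that diversity can be approximated by the number of distinct (group, word) pairs a tool produces and negativity by the share of generated words flagged as negative. The record layout, the source names, the placeholder words, and the `summarize` helper are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical records: (source, group, candidate_word, is_negative).
# Placeholder values only; these are not data from the study.
records = [
    ("search_engine", "group_a", "word1", False),
    ("search_engine", "group_a", "word1", False),
    ("search_engine", "group_a", "word2", True),
    ("llm", "group_a", "word1", False),
    ("llm", "group_a", "word3", False),
    ("llm", "group_a", "word4", True),
    ("llm", "group_a", "word5", False),
]


def summarize(rows):
    """Return {source: (distinct_pair_count, negative_share)}.

    distinct_pair_count: number of unique (group, word) pairs per source,
    used here as an assumed proxy for diversity.
    negative_share: fraction of all generated words flagged negative.
    """
    pairs = defaultdict(set)
    negative = defaultdict(int)
    total = defaultdict(int)
    for source, group, word, is_negative in rows:
        pairs[source].add((group, word))
        total[source] += 1
        negative[source] += int(is_negative)
    return {s: (len(pairs[s]), negative[s] / total[s]) for s in total}


summary = summarize(records)
for source, (distinct, neg_share) in summary.items():
    print(f"{source}: {distinct} distinct views, {neg_share:.2f} negative share")
```

With per-source summaries of this kind, tools can be compared side by side. The paper's actual metrics and its method for labelling words as negative may differ from these assumed proxies.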
About the journal:
PeerJ Computer Science is the new open access journal covering all subject areas in computer science, with the backing of a prestigious advisory board and more than 300 academic editors.