Culture machines
Rodney H. Jones
Applied Linguistics Review
DOI: 10.1515/applirev-2024-0188 (https://doi.org/10.1515/applirev-2024-0188)
Published: 2024-08-16
Abstract
This paper discusses the way the concept of culture is discursively constructed by large language models that are trained on massive collections of cultural artefacts and designed to produce probabilistic representations of culture based on this training data. It makes the argument that, no matter how ‘diverse’ their training data is, large language models will always be prone to stereotyping and oversimplification because of the mathematical models that underpin their operations. Efforts to build ‘guardrails’ into systems to reduce their tendency to stereotype can often result in the opposite problem, with issues around culture and ethnicity being ‘invisiblised’. To illustrate this, examples are provided of the stereotypical linguistic styles and cultural attitudes models produce when asked to portray different kinds of ‘persona’. The tendency of large language models to gravitate towards cultural and linguistic generalities is contrasted with trends in intercultural communication towards more fluid, socially situated understandings of interculturality, and implications for the future of cultural representation are discussed.
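The argument that stereotyping follows from the mathematics itself, not just from the training data, can be illustrated with a toy sketch (this is an illustrative assumption, not an example from the paper): even when a corpus contains diverse associations for a group, a model that estimates next-token probabilities from frequencies and decodes greedily will collapse that diversity onto the single most frequent association.

```python
from collections import Counter

# Hypothetical toy corpus: continuations of a prompt like
# "People from X are ..." as they might appear in training data.
# The data is 'diverse' -- four different associations survive --
# but one association dominates numerically.
corpus = ["polite"] * 60 + ["reserved"] * 25 + ["loud"] * 10 + ["varied"] * 5

# A language model, in effect, estimates next-token probabilities
# from co-occurrence frequencies in its training data.
counts = Counter(corpus)
total = sum(counts.values())
probs = {word: n / total for word, n in counts.items()}

# Most-likely (greedy) decoding always returns the dominant association,
# so minority continuations never surface in the output at all.
most_likely = max(probs, key=probs.get)
print(most_likely)  # 'polite' -- the majority pattern wins, however diverse the corpus
```

Sampling with temperature rather than taking the maximum softens this, but the output distribution still concentrates probability mass on the most frequent patterns, which is one way to read the paper's point that 'diverse' data alone cannot prevent a probabilistic model from gravitating toward generalities.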