Title: Vulgarity in online discourse around the English-speaking world
Authors: Martin Schweinberger, Kate Burridge
Journal: Lingua, volume 321, Article 103946
Publication date: 2025-04-23
DOI: 10.1016/j.lingua.2025.103946
URL: https://www.sciencedirect.com/science/article/pii/S0024384125000713
Journal category: Language & Linguistics
Impact factor: 1.1
Open access: no
Citation count: 0
Abstract
This paper takes a corpus-based approach to studying vulgar language in online communication across 20 English-speaking regions, based on the Global Web-Based English Corpus (GloWbE). The identification of vulgarity combines word lists used in profanity detection with regular expressions to capture a wide range of vulgar elements, including spelling variants and obscured forms. The results show a notable trend for inner circle L1 varieties to exhibit higher rates of vulgarity online than outer circle and L2 varieties. The results also show that inner circle varieties have lower adapted corrected type-token ratios, which indicates that inner circle variety speakers use more varied English vulgar forms than speakers of other circle varieties. In addition, there is a general register difference, with vulgarity being more common in blog data than in general web content. Finally, the results show that different regions exhibit preferences for specific vulgar lemmas, with feck being preferred in Ireland, cunt in Britain, and ass(hole) in the United States. The findings are interpreted to show that cultural differences are reflected in region-specific preferences for vulgarity, and that the creativity observed in inner circle varieties is linked to norm-setting, as opposed to the norm-reception associated with outer circle varieties.
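The abstract describes the identification step only in outline, so the following is a rough illustration rather than the authors' procedure. It is a minimal Python sketch assuming a tiny hypothetical pattern table (`LEMMA_PATTERNS`) for two of the lemmas named above, and it uses Carroll's standard corrected type-token ratio (types divided by the square root of twice the tokens) as a stand-in, since the paper's "adapted" variant is not defined in the abstract.

```python
import math
import re

# Hypothetical mini pattern table (an assumption for illustration only);
# the word lists and regular expressions used in the paper are not given here.
# Lookarounds are used instead of \b so that obscured forms ending in a
# symbol (e.g. "a$$") still have a well-defined boundary.
LEMMA_PATTERNS = {
    # feck, feeeck, fecks, fecked, fecking, feckin, fecker(s)
    "feck": re.compile(r"(?<!\w)f+e+c+k+(?:s|ed|ing|in|er|ers)?(?!\w)", re.IGNORECASE),
    # ass, asshole(s), and obscured forms such as a$$hole or a**hole
    "ass": re.compile(r"(?<!\w)a[s$*]{2,}(?:holes?)?(?!\w)", re.IGNORECASE),
}


def find_vulgar(text):
    """Return (lemma, surface_form) pairs in order of appearance."""
    hits = []
    for lemma, pattern in LEMMA_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), lemma, match.group(0).lower()))
    return [(lemma, form) for _, lemma, form in sorted(hits)]


def corrected_ttr(forms):
    """Carroll's corrected type-token ratio: types / sqrt(2 * tokens).

    This is the standard measure, not the paper's adapted version.
    """
    if not forms:
        return 0.0
    return len(set(forms)) / math.sqrt(2 * len(forms))


sample = "Feck off, ya a$$hole. Fecking eejit, total ass."
pairs = find_vulgar(sample)
forms = [form for _, form in pairs]
print(pairs)
print(round(corrected_ttr(forms), 3))
```

Keeping the surface form alongside the lemma lets the same hits feed both a per-lemma frequency count (for regional preferences) and a type-token measure (for variety of forms). Ordinary words such as "assess" or "class" are not matched by these patterns because of the boundary lookarounds.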
About the journal:
Lingua publishes papers of any length, if justified, as well as review articles surveying developments in the various fields of linguistics, and occasional discussions. A considerable number of pages in each issue are devoted to critical book reviews. Lingua also publishes Lingua Franca articles consisting of provocative exchanges expressing strong opinions on central topics in linguistics; The Decade In articles which are educational articles offering the nonspecialist linguist an overview of a given area of study; and Taking up the Gauntlet special issues composed of a set number of papers examining one set of data and exploring whose theory offers the most insight with a minimal set of assumptions and a maximum of arguments.