A Multi-Dimensional Analysis of English tweets

IF 0.8 · Zone 3 (Literature) · LANGUAGE & LINGUISTICS

Language and Literature Pub Date : 2022-05-01 DOI:10.1177/09639470221090369

Isobelle Clarke

Journal: Language and Literature · Volume: 31 1 · Pages: 124 - 149 · Publication type: Journal Article · https://doi.org/10.1177/09639470221090369

Citations: 4

Abstract

This paper applies Multi-Dimensional Analysis (MDA) to a corpus of English tweets to uncover the most common patterns of linguistic variation. MDA is a widely used method in corpus linguistics for analysing functional and/or stylistic variation in a particular language variety. It aims to identify and interpret frequent patterns of co-occurring linguistic features across a corpus, such as a corpus of spoken and written English registers (Biber, 1988). Traditionally, MDA is based on a factor analysis of the relative frequencies of numerous grammatical features, measured across numerous texts drawn from that variety, to identify a series of underlying dimensions of linguistic variation. Despite its popularity and utility, traditional MDA has an important limitation: it can only be used to analyse texts that are long enough for the relative frequencies of many grammatical forms to be estimated accurately. If the texts under analysis are too short, few forms can be expected to occur frequently enough for their relative frequencies to be estimated accurately. Tweets are characteristically short texts, so traditional MDA cannot be used in the present research. To overcome this problem, this paper introduces a short-text version of MDA and applies it to a corpus of English tweets. Specifically, rather than measuring the relative frequencies of forms in each tweet, the approach records whether each form occurs. The resulting binary dataset is then aggregated using Multiple Correspondence Analysis (MCA), which is used much like factor analysis in traditional MDA: it returns a series of dimensions that represent the most common patterns of linguistic variation in the dataset. After controlling for text length in the first dimension, four subsequent dimensions are interpreted. The results suggest that there is a great deal of linguistic variation on Twitter. Notably, the results show that Twitter is commonly used for self-commodification: people manage their identities and engage in practices of self-branding through stance-taking, self-reporting, promotion and persuasion, and they broadcast their message beyond their followership, distribute news and express opposition, often in order to attract attention. Additionally, the results show that interaction is common, suggesting that Twitter is also used for social and interpersonal gain.
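
The step from binary occurrence data to MCA dimensions described in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the feature names, the toy matrix and the function `mca_from_binary` are hypothetical, and MCA is computed here as a correspondence analysis of the present/absent indicator matrix using NumPy's SVD. The sketch does not reproduce the paper's handling of text length in the first dimension.

```python
import numpy as np


def mca_from_binary(occurrence, n_dims=2):
    """Run a basic MCA on a 0/1 matrix of shape (n_texts, n_features).

    Each feature is treated as a categorical variable with two
    categories (present, absent), as in an indicator-matrix MCA.
    """
    X = np.asarray(occurrence, dtype=float)
    # Indicator matrix: one "present" and one "absent" column per feature.
    Z = np.hstack([X, 1.0 - X])
    # Drop categories that never occur to avoid dividing by zero below.
    Z = Z[:, Z.sum(axis=0) > 0]

    P = Z / Z.sum()              # correspondence matrix
    r = P.sum(axis=1)            # row masses (one per text)
    c = P.sum(axis=0)            # column masses (one per category)
    # Standardised residuals from the independence model.
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))

    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    # Principal coordinates of texts and of feature categories.
    row_coords = (U[:, :n_dims] * sv[:n_dims]) / np.sqrt(r)[:, None]
    col_coords = (Vt.T[:, :n_dims] * sv[:n_dims]) / np.sqrt(c)[:, None]
    inertia = sv[:n_dims] ** 2   # principal inertia per dimension
    return row_coords, col_coords, inertia


# Hypothetical features: first-person pronoun, hashtag, URL, question mark.
# Rows are toy "tweets"; 1 means the feature occurs at least once.
toy = np.array([
    [1, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
])

rows, cols, inertia = mca_from_binary(toy, n_dims=2)
```

Texts with similar feature profiles receive similar coordinates in `rows`, and categories that tend to co-occur lie close together in `cols`, which is what would then be interpreted as a dimension of variation.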

Source journal

Language and Literature

CiteScore: 1.70

Self-citation rate: 14.30%

Journal description: Language and Literature is an invaluable international peer-reviewed journal that covers the latest research in stylistics, defined as the study of style in literary and non-literary language. We publish theoretical, empirical and experimental research that aims to make a contribution to our understanding of style and its effects on readers. Topics covered by the journal include (but are not limited to) the following: the stylistic analysis of literary and non-literary texts, cognitive approaches to text comprehension, corpus and computational stylistics, the stylistic investigation of multimodal texts, pedagogical stylistics, the reading process, software development for stylistics, and real-world applications for stylistic analysis. We welcome articles that investigate the relationship between stylistics and other areas of linguistics, such as text linguistics, sociolinguistics and translation studies. We also encourage interdisciplinary submissions that explore the connections between stylistics and such cognate subjects and disciplines as psychology, literary studies, narratology, computer science and neuroscience. Language and Literature is essential reading for academics, teachers and students working in stylistics and related areas of language and literary studies.