Poisson regression for linguists: A tutorial introduction to modelling count data with brms

Impact factor: 2.8. Category: Language & Linguistics
Bodo Winter, Paul-Christian Bürkner
Language and Linguistics Compass, volume 15, issue 11. Published 2021-11-16. DOI: 10.1111/lnc3.12439
https://onlinelibrary.wiley.com/doi/10.1111/lnc3.12439
Citations: 26

Abstract

Count data is prevalent in many different areas of linguistics, such as when counting words, syntactic constructions, discourse particles, case markers, or speech errors. The Poisson distribution is the canonical distribution for characterising count data with no or unknown upper bound. Given the prevalence of count data in linguistics, Poisson regression has wide utility no matter what subfield of linguistics is considered. However, in contrast to logistic regression, Poisson regression is surprisingly little known. Here, we make a case for why linguists need to consider Poisson regression, and give recommendations for when Poisson regression is more appropriate compared to logistic regression. This tutorial introduces readers to foundational concepts needed to understand the basics of Poisson regression, followed by a hands-on tutorial using the R package brms. We discuss a dataset where Catalan and Korean speakers change the frequency of their co-speech gestures as a function of politeness contexts. This dataset also involves exposure variables (the incorporation of time to deal with unequal intervals) and overdispersion (excess variance). Altogether, we hope that more linguists will consider Poisson regression for the analysis of count data.
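
As a rough orientation to the terms used in the abstract, the following is a minimal sketch of a Poisson regression with an exposure term, written in the standard textbook form. The symbols ($y_i$ for a count such as the number of gestures, $x_i$ for a predictor such as politeness context, $t_i$ for the exposure such as observation time, and $\phi$ for a dispersion parameter) are illustrative assumptions and are not taken from the paper itself.

```latex
% Poisson regression with a log link and an exposure (offset) term
y_i \sim \mathrm{Poisson}(\lambda_i), \qquad
\log \lambda_i = \beta_0 + \beta_1 x_i + \log t_i

% Equivalently, the model describes a rate per unit of exposure
\frac{\lambda_i}{t_i} = \exp(\beta_0 + \beta_1 x_i)

% The Poisson distribution forces the variance to equal the mean
\mathrm{E}[y_i] = \mathrm{Var}[y_i] = \lambda_i

% Overdispersion means the observed variance exceeds the mean; one common
% remedy is a negative binomial model with extra dispersion parameter \phi
\mathrm{Var}[y_i] = \lambda_i + \frac{\lambda_i^{2}}{\phi}
```

Under this formulation, the $\log t_i$ term enters with a fixed coefficient of 1, so counts collected over unequal intervals are compared as rates. As $\phi$ grows large, the negative binomial variance approaches the Poisson variance.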


Source journal: Language and Linguistics Compass (Language & Linguistics)
CiteScore: 5.40
Self-citation rate: 4.00%
Articles published: 39
About the journal: Unique in its range, Language and Linguistics Compass is an online-only journal publishing original, peer-reviewed surveys of current research from across the entire discipline. Language and Linguistics Compass publishes state-of-the-art reviews, supported by a comprehensive bibliography and accessible to an international readership. Language and Linguistics Compass is aimed at senior undergraduates, postgraduates and academics, and will provide a unique reference tool for researching essays, preparing lectures, writing a research proposal, or just keeping up with new developments in a specific area of interest.