A guide on extracting and tidying tweets with R

J. Adams, Carlos Augusto Jardim Chiarelli
Journal: Cadernos de Linguística. Published: 2021-12-03. DOI: https://doi.org/10.25189/2675-4916.2021.v2.n4.id410

Abstract

Social media platforms represent a deep resource for academic research and offer a wide range of untapped possibilities for linguists (D'ARCY; YOUNG, 2012). This rapidly developing field raises various ethical issues and poses unique challenges regarding methods for retrieving and analyzing data. This tutorial provides a straightforward guide to harvesting and tidying Twitter data, focusing mainly on the text of tweets, using the R programming language (R CORE TEAM, 2020) and Twitter's APIs. The R code was developed in Adams (2020), based on the rtweet package (KEARNEY, 2018), and resulted in a script for corpus compilation. In this tutorial, we discuss limitations, problems, and solutions in our framework for conducting ethical research on this social networking site. Our ethical concerns go beyond what we "agree to" in terms of use and privacy policies; that is, we argue that these policies do not cover all the concerns researchers need to attend to. Additionally, we aim to show that using Twitter as a data source does not require advanced computational skills.