Classifying latent user attributes in twitter

SMUC '10 Pub Date: 2010-10-30 DOI: 10.1145/1871985.1871993
D. Rao, David Yarowsky, Abhishek Shreevats, Manaswi Gupta
Citations: 686

Abstract

Social media outlets such as Twitter have become an important forum for peer interaction. Thus, the ability to classify latent user attributes, including gender, age, regional origin, and political orientation, solely from Twitter user language or similar highly informal content has important applications in advertising, personalization, and recommendation. This paper includes a novel investigation of stacked-SVM-based classification algorithms over a rich set of original features, applied to classifying these four user attributes. It also includes extensive analysis of features and approaches that are effective and not effective in classifying user attributes in Twitter-style informal written genres, as distinct from the other, primarily spoken genres previously studied in the user-property classification literature. Our models, singly and in ensemble, significantly outperform baseline models in all cases. A detailed analysis of model components and features provides an often entertaining insight into distinctive language-usage variation across gender, age, regional origin, and political orientation in modern informal communication.