Using Authorship Verification to Mitigate Abuse in Online Communities

International Conference on Web and Social Media. Pub Date: 2022-05-31. DOI: 10.1609/icwsm.v16i1.19359

Janith Weerasinghe, Rhia Singh, R. Greenstadt


Citations: 2

Abstract


Social media has become an important method for information sharing. This has also created opportunities for bad actors to easily spread disinformation and manipulate public opinion. This paper explores the possibility of applying Authorship Verification to online communities to mitigate abuse: by analyzing the writing style of online accounts, we identify accounts managed by the same person. We expand on our similarity-based authorship verification approach, previously applied to large fanfiction documents, and show that it works in open-world settings and on shorter documents, and is largely topic-agnostic. Our expanded model can link Reddit accounts based on the writing style of only 40 comments with an AUC of 0.95, and the performance increases to 0.98 given more content. We apply this model to a set of suspicious Reddit accounts associated with the disinformation campaign surrounding the 2016 U.S. presidential election and show that the writing style of these accounts is inconsistent, indicating that each account was likely maintained by multiple individuals. We also apply this model to Reddit user accounts that commented on the WallStreetBets subreddit around the 2021 GameStop short squeeze and show that a number of account pairs share very similar writing styles. Finally, we show that this approach can link accounts across Reddit and Twitter with an AUC of 0.91 even when training data is very limited.
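To make the terms "similarity-based" and "AUC" concrete, here is a minimal Python sketch. It is not the authors' model: the abstract does not describe their features or classifier, so the character trigram profiles, the cosine score, and the function names `char_ngrams`, `cosine`, `style_similarity`, and `auc` are all assumptions chosen for illustration only.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in the lowercased text."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def style_similarity(comments_a, comments_b, n=3):
    """Score two accounts by comparing n-gram profiles of their pooled comments.

    Assumption: a stand-in feature set, not the one used in the paper.
    """
    profile_a = char_ngrams(" ".join(comments_a), n)
    profile_b = char_ngrams(" ".join(comments_b), n)
    return cosine(profile_a, profile_b)


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen same-author pair (label 1)
    scores higher than a randomly chosen different-author pair (label 0),
    counting ties as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

In use, one would call `style_similarity` on each candidate account pair, then pass the resulting scores and the known same-author labels to `auc`. A value near 1.0 means same-author pairs are almost always ranked above different-author pairs, which is how the reported figures of 0.95, 0.98, and 0.91 should be read.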

