Detecting Social Media Icebergs by Their Tips: Rumors, Persuasion Campaigns, and Information Needs

Zhe Zhao
Published in: Proceedings of the Ninth ACM International Conference on Web Search and Data Mining (WSDM 2016), February 22-25, 2016, San Francisco, CA, USA
Publication date: 2016-02-08
DOI: https://doi.org/10.1145/2835776.2855086
ACM ISBN 978-1-4503-3716-8/16/02

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s). © 2016 Copyright held by the owner/author(s).

Abstract

The online activities of more than one billion social media users worldwide form a rich ocean of data. Many social media mining techniques explore this ocean to extract different types of resources. In this thesis, we present a framework for detecting different types of meaningful social media phenomena. Each phenomenon can usually be viewed as a group of online activities by many users who share a common or similar objective, such as spreading a rumor, expressing a burst of information needs about events and products, or asking for support of an action. These phenomena are relatively rare but can be very influential.

Detecting them is challenging because of their characteristics. Each phenomenon comprises a collection of activities that usually take a variety of forms. Taking the spread of rumors as an example, a single rumor may be circulated in many different statements and expressions, and these can be very hard to distinguish from statements by trustworthy sources. Existing work on detecting social media phenomena usually adopts classifiers trained on features of a single activity or of a cluster of activities [1]. However, the features of a single activity are not sufficient for many detection tasks, and the features of a cluster do not become significant until the cluster grows large enough, so they cannot be used for early-stage detection.

In this thesis, we propose to detect meaningful social media phenomena through signal user behaviors observed at an early stage. Just as an iceberg in the ocean can be spotted by its tip, the tip of a social media iceberg is a small proportion of activities that occur only within such phenomena and that can be found even at an early stage. We therefore design our detection framework to first detect these specific signal activities and then use them to understand the characteristics of the entire collection of activities that makes up the phenomenon. What we learn can be used to train accurate classifiers that decide whether a collection of activities containing signal activities is a target social media phenomenon. The framework is generic and can be applied to detecting many different types of collective activity in social media.

We apply our framework to detecting three types of meaningful social media phenomena: emerging information needs, trending rumors, and persuasion campaigns. To detect emerging information needs, we train a classifier to detect question-asking behavior as the signal. We analyze all the questions detected by this classifier and extract keywords from their content to identify emerging information needs. We find that, as signal activities, the questions being asked are substantially different from other types of activities, and that the keywords extracted from them have considerable power to predict trends in Google queries [2].

In our work on detecting trending rumors [3], we find that when a rumor is circulating, even though most posts do not question it, a few may. Such questions, which express doubt about whether a piece of information is true, can help us identify controversial and unconfirmed statements such as social media rumors. We therefore adopt this type of enquiry activity as the signal for detecting rumors. Experimental results show that our approach detects social media rumors at an early stage effectively and efficiently.

Finally, we propose to apply and improve our framework to detect another very important type of social media phenomenon: persuasion campaigns. We will first study and formally define social media persuasion campaigns. We will then implement our detection framework and experiment with different signal activities. We also propose to develop an algorithm for discovering activities that oppose the detected persuasion campaigns. We will conduct experiments on Twitter to evaluate the effectiveness and efficiency of our method.
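
The signal-first idea described in the abstract (find signal activities, group them, attach the remaining related activities, then describe each group for a classifier) can be illustrated with a minimal sketch for the rumor case. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the enquiry regular expressions, the Jaccard word-overlap similarity, the 0.4 threshold, and the names `is_signal`, `detect_candidates`, and `cluster_features` are all hypothetical.

```python
import re
from dataclasses import dataclass, field

# Hypothetical enquiry patterns (an assumption for this sketch, not the
# patterns used in the thesis).
ENQUIRY_PATTERNS = [
    re.compile(r"\bis (this|that|it) true\b", re.IGNORECASE),
    re.compile(r"\b(rumou?rs?|unconfirmed|debunk(ed)?)\b", re.IGNORECASE),
    re.compile(r"\breally\?", re.IGNORECASE),
]


def is_signal(text: str) -> bool:
    """Return True if the post looks like an enquiry (a signal activity)."""
    return any(p.search(text) for p in ENQUIRY_PATTERNS)


def tokens(text: str) -> frozenset:
    """Lowercased word set used for a crude content comparison."""
    return frozenset(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard(a: frozenset, b: frozenset) -> float:
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


@dataclass
class Cluster:
    signal_posts: list
    related_posts: list = field(default_factory=list)


def detect_candidates(posts, threshold=0.4):
    """Group signal posts, then attach similar non-signal posts to them."""
    signal = [p for p in posts if is_signal(p)]
    others = [p for p in posts if not is_signal(p)]

    clusters = []
    for post in signal:
        tok = tokens(post)
        for c in clusters:
            # Compare against the first signal post of each cluster.
            if jaccard(tok, tokens(c.signal_posts[0])) >= threshold:
                c.signal_posts.append(post)
                break
        else:
            clusters.append(Cluster([post]))

    for post in others:
        tok = tokens(post)
        for c in clusters:
            if any(jaccard(tok, tokens(s)) >= threshold for s in c.signal_posts):
                c.related_posts.append(post)
                break
    return clusters


def cluster_features(c: Cluster) -> dict:
    """Toy features a downstream classifier could consume."""
    total = len(c.signal_posts) + len(c.related_posts)
    return {"size": total, "signal_ratio": len(c.signal_posts) / total}
```

In this sketch only clusters that contain at least one signal post are ever created, which mirrors the point that candidates can be surfaced while the overall collection of activities is still small. A real system would replace the toy features with a richer feature set and a trained classifier.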