Adaptive Evolutionary Filtering in Real-Time Twitter Stream

Feifan Fan, Yansong Feng, Lili Yao, Dongyan Zhao
{"title":"Adaptive Evolutionary Filtering in Real-Time Twitter Stream","authors":"Feifan Fan, Yansong Feng, Lili Yao, Dongyan Zhao","doi":"10.1145/2983323.2983760","DOIUrl":null,"url":null,"abstract":"With the explosive growth of microblogging service, Twitter has become a leading platform consisting of real-time world wide information. Users tend to explore breaking news or general topics in Twitter according to their interests. However, the explosive amount of incoming tweets leads users to information overload. Therefore, filtering interesting tweets based on users' interest profiles from real-time stream can be helpful for users to easily access the relevant and key information hidden among the tweets. On the other hand, real-time twitter stream contains enormous amount of noisy and redundant tweets. Hence, the filtering process should consider previously pushed interesting tweets to provide users with diverse tweets. What's more, different from traditional document summarization methods which focus on static dataset, the twitter stream is dynamic, fast-arriving and large-scale, which means we have to decide whether to filter the coming tweet for users from the real-time stream as early as possible. In this paper, we propose a novel adaptive evolutionary filtering framework to push interesting tweets for users from real-time twitter stream. First, we propose an adaptive evolutionary filtering algorithm to filter interesting tweets from the twitter stream with respect to user interest profiles. And then we utilize the maximal marginal relevance model in fixed time window to estimate the relevance and diversity of potential tweets. Besides, to overcome the enormous number of redundant tweets and characterize the diversity of potential tweets, we propose a hierarchical tweet representation learning model (HTM) to learn the tweet representations dynamically over time. Experiments on large scale real-time twitter stream datasets demonstrate the efficiency and effectiveness of our framework.","PeriodicalId":250808,"journal":{"name":"Proceedings of the 25th ACM International on Conference on Information and Knowledge Management","volume":"96 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 25th ACM International on Conference on Information and Knowledge Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2983323.2983760","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

Abstract

With the explosive growth of microblogging service, Twitter has become a leading platform consisting of real-time world wide information. Users tend to explore breaking news or general topics in Twitter according to their interests. However, the explosive amount of incoming tweets leads users to information overload. Therefore, filtering interesting tweets based on users' interest profiles from real-time stream can be helpful for users to easily access the relevant and key information hidden among the tweets. On the other hand, real-time twitter stream contains enormous amount of noisy and redundant tweets. Hence, the filtering process should consider previously pushed interesting tweets to provide users with diverse tweets. What's more, different from traditional document summarization methods which focus on static dataset, the twitter stream is dynamic, fast-arriving and large-scale, which means we have to decide whether to filter the coming tweet for users from the real-time stream as early as possible. In this paper, we propose a novel adaptive evolutionary filtering framework to push interesting tweets for users from real-time twitter stream. First, we propose an adaptive evolutionary filtering algorithm to filter interesting tweets from the twitter stream with respect to user interest profiles. And then we utilize the maximal marginal relevance model in fixed time window to estimate the relevance and diversity of potential tweets. Besides, to overcome the enormous number of redundant tweets and characterize the diversity of potential tweets, we propose a hierarchical tweet representation learning model (HTM) to learn the tweet representations dynamically over time. Experiments on large scale real-time twitter stream datasets demonstrate the efficiency and effectiveness of our framework.
实时Twitter流中的自适应进化过滤
随着微博服务的爆炸式增长,Twitter已经成为全球实时信息的领先平台。用户倾向于根据自己的兴趣在Twitter上探索突发新闻或一般话题。然而,大量涌入的推文导致用户信息过载。因此,根据用户的兴趣概况从实时流中过滤出感兴趣的推文,有助于用户轻松获取隐藏在推文中的相关信息和关键信息。另一方面,实时推特流包含了大量的嘈杂和冗余的推文。因此,过滤过程应该考虑以前推送过的有趣tweets,为用户提供多样化的tweets。更重要的是,与传统的文档摘要方法关注静态数据集不同,twitter流是动态的、快速到达的、大规模的,这意味着我们必须尽早决定是否从实时流中为用户过滤即将到来的tweet。在本文中,我们提出了一个新的自适应进化过滤框架,从实时推特流中为用户推送有趣的推文。首先,我们提出了一种自适应进化过滤算法,根据用户兴趣概况从推特流中过滤有趣的推文。然后利用固定时间窗的最大边际相关性模型来估计潜在推文的相关性和多样性。此外,为了克服大量冗余推文并表征潜在推文的多样性,我们提出了一种分层推文表示学习模型(HTM)来随时间动态学习推文表示。在大规模实时twitter流数据集上的实验证明了该框架的效率和有效性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信