Forecasting the Future: Leveraging RNN based Feature Concatenation for Tweet Outbreak Prediction

Saswata Roy, B. Suman, Joydeep Chandra, Sourav Kumar Dandapat
Published in: Proceedings of the 7th ACM IKDD CoDS and 25th COMAD, 2020-01-05
DOI: 10.1145/3371158.3371190 (https://doi.org/10.1145/3371158.3371190)
Citations: 5

Abstract

Cascade outbreaks are a common phenomenon across social networking platforms. They can have severe consequences: fake news or a rumour can reach a large number of people, and hateful content can spread and incite violence. Early prediction of cascade outbreaks helps in taking timely remedial action and is therefore an important research direction. Most existing approaches predict the popularity of a social networking post using either machine learning techniques or statistical models. Simple machine learning approaches may miss important features, while statistical models rely on hard-coded functions that may not suit a different scenario. With the availability of large datasets, deep learning models have recently also been applied to cascade outbreak prediction. This study identifies limitations of existing deep learning approaches and proposes a Recurrent Neural Network based Hybrid Model with Feature Concatenation (RNN-HMFC). RNN-HMFC captures latent features of the tweet text with an LSTM and of the retweet information with a GRU, and combines them with a set of handcrafted features, such as additional tweet information and user social information, to predict virality. We achieve 2.7% to 6.45% higher accuracy than state-of-the-art methods on different datasets.
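
The feature-concatenation idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no layer sizes, inputs, or training details, so every name, dimension, and value below is a hypothetical choice, the weights are random and untrained, and biases are omitted. It only shows the general shape: an LSTM encodes the text sequence, a GRU encodes the retweet sequence, and both final states are concatenated with handcrafted features before a logistic output.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def gate(W, U, x, h, act):
    """Compute act(W x + U h) element-wise (biases omitted for brevity)."""
    return [act(a + b) for a, b in zip(matvec(W, x), matvec(U, h))]


def init_params(rng, names, in_dim, hidden):
    """Random, untrained (input, recurrent) weight pair for each named gate."""
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return {n: (mat(hidden, in_dim), mat(hidden, hidden)) for n in names}


def lstm_encode(seq, p, hidden):
    """Run a single-layer LSTM over seq and return the final hidden state."""
    h, c = [0.0] * hidden, [0.0] * hidden
    for x in seq:
        i = gate(*p["i"], x, h, sigmoid)    # input gate
        f = gate(*p["f"], x, h, sigmoid)    # forget gate
        o = gate(*p["o"], x, h, sigmoid)    # output gate
        g = gate(*p["g"], x, h, math.tanh)  # candidate cell state
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h


def gru_encode(seq, p, hidden):
    """Run a single-layer GRU over seq and return the final hidden state."""
    h = [0.0] * hidden
    for x in seq:
        z = gate(*p["z"], x, h, sigmoid)     # update gate
        r = gate(*p["r"], x, h, sigmoid)     # reset gate
        rh = [rj * hj for rj, hj in zip(r, h)]
        n = gate(*p["n"], x, rh, math.tanh)  # candidate state
        h = [(1 - zj) * hj + zj * nj for zj, hj, nj in zip(z, h, n)]
    return h


def predict_outbreak(text_seq, retweet_seq, handcrafted, model):
    """Concatenate both encodings with handcrafted features, then score."""
    text_vec = lstm_encode(text_seq, model["lstm"], model["hidden"])
    retweet_vec = gru_encode(retweet_seq, model["gru"], model["hidden"])
    features = text_vec + retweet_vec + list(handcrafted)  # feature concatenation
    score = sum(w * x for w, x in zip(model["out"], features)) + model["bias"]
    return features, sigmoid(score)


# Hypothetical sizes chosen only for this illustration.
HIDDEN, EMB_DIM, RT_DIM, N_HAND = 4, 3, 1, 2
rng = random.Random(0)
model = {
    "hidden": HIDDEN,
    "lstm": init_params(rng, "ifog", EMB_DIM, HIDDEN),
    "gru": init_params(rng, "zrn", RT_DIM, HIDDEN),
    "out": [rng.uniform(-0.5, 0.5) for _ in range(2 * HIDDEN + N_HAND)],
    "bias": 0.0,
}

text_seq = [[0.1, -0.2, 0.3], [0.0, 0.5, -0.1], [0.4, 0.4, 0.2]]  # toy token embeddings
retweet_seq = [[1.0], [3.0], [7.0], [12.0]]  # toy retweet counts per time window
handcrafted = [1.0, 0.25]  # made-up tweet and user features

features, prob = predict_outbreak(text_seq, retweet_seq, handcrafted, model)
print(len(features), prob)
```

Because the weights are random, the printed probability carries no meaning; in practice the encoders and the output layer would be trained jointly on labelled cascades, most likely with a deep learning framework rather than hand-written cells.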