Qi Cao, Huawei Shen, Keting Cen, W. Ouyang, Xueqi Cheng
{"title":"DeepHawkes:弥合信息级联预测和理解之间的差距","authors":"Qi Cao, Huawei Shen, Keting Cen, W. Ouyang, Xueqi Cheng","doi":"10.1145/3132847.3132973","DOIUrl":null,"url":null,"abstract":"Online social media remarkably facilitates the production and delivery of information, intensifying the competition among vast information for users' attention and highlighting the importance of predicting the popularity of information. Existing approaches for popularity prediction fall into two paradigms: feature-based approaches and generative approaches. Feature-based approaches extract various features (e.g., user, content, structural, and temporal features), and predict the future popularity of information by training a regression/classification model. Their predictive performance heavily depends on the quality of hand-crafted features. In contrast, generative approaches devote to characterizing and modeling the process that a piece of information accrues attentions, offering us high ease to understand the underlying mechanisms governing the popularity dynamics of information cascades. But they have less desirable predictive power since they are not optimized for popularity prediction. In this paper, we propose DeepHawkes to combat the defects of existing methods, leveraging end-to-end deep learning to make an analogy to interpretable factors of Hawkes process --- a widely-used generative process to model information cascade. DeepHawkes inherits the high interpretability of Hawkes process and possesses the high predictive power of deep learning methods, bridging the gap between prediction and understanding of information cascades. We verify the effectiveness of DeepHawkes by applying it to predict retweet cascades of Sina Weibo and citation cascades of a longitudinal citation dataset. Experimental results demonstrate that DeepHawkes outperforms both feature-based and generative approaches.","PeriodicalId":20449,"journal":{"name":"Proceedings of the 2017 ACM on Conference on Information and Knowledge Management","volume":"114 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2017-11-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"182","resultStr":"{\"title\":\"DeepHawkes: Bridging the Gap between Prediction and Understanding of Information Cascades\",\"authors\":\"Qi Cao, Huawei Shen, Keting Cen, W. Ouyang, Xueqi Cheng\",\"doi\":\"10.1145/3132847.3132973\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Online social media remarkably facilitates the production and delivery of information, intensifying the competition among vast information for users' attention and highlighting the importance of predicting the popularity of information. Existing approaches for popularity prediction fall into two paradigms: feature-based approaches and generative approaches. Feature-based approaches extract various features (e.g., user, content, structural, and temporal features), and predict the future popularity of information by training a regression/classification model. Their predictive performance heavily depends on the quality of hand-crafted features. In contrast, generative approaches devote to characterizing and modeling the process that a piece of information accrues attentions, offering us high ease to understand the underlying mechanisms governing the popularity dynamics of information cascades. But they have less desirable predictive power since they are not optimized for popularity prediction. In this paper, we propose DeepHawkes to combat the defects of existing methods, leveraging end-to-end deep learning to make an analogy to interpretable factors of Hawkes process --- a widely-used generative process to model information cascade. DeepHawkes inherits the high interpretability of Hawkes process and possesses the high predictive power of deep learning methods, bridging the gap between prediction and understanding of information cascades. We verify the effectiveness of DeepHawkes by applying it to predict retweet cascades of Sina Weibo and citation cascades of a longitudinal citation dataset. Experimental results demonstrate that DeepHawkes outperforms both feature-based and generative approaches.\",\"PeriodicalId\":20449,\"journal\":{\"name\":\"Proceedings of the 2017 ACM on Conference on Information and Knowledge Management\",\"volume\":\"114 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-11-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"182\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2017 ACM on Conference on Information and Knowledge Management\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3132847.3132973\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2017 ACM on Conference on Information and Knowledge Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3132847.3132973","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
DeepHawkes: Bridging the Gap between Prediction and Understanding of Information Cascades
Online social media remarkably facilitates the production and delivery of information, intensifying the competition among vast information for users' attention and highlighting the importance of predicting the popularity of information. Existing approaches for popularity prediction fall into two paradigms: feature-based approaches and generative approaches. Feature-based approaches extract various features (e.g., user, content, structural, and temporal features), and predict the future popularity of information by training a regression/classification model. Their predictive performance heavily depends on the quality of hand-crafted features. In contrast, generative approaches devote to characterizing and modeling the process that a piece of information accrues attentions, offering us high ease to understand the underlying mechanisms governing the popularity dynamics of information cascades. But they have less desirable predictive power since they are not optimized for popularity prediction. In this paper, we propose DeepHawkes to combat the defects of existing methods, leveraging end-to-end deep learning to make an analogy to interpretable factors of Hawkes process --- a widely-used generative process to model information cascade. DeepHawkes inherits the high interpretability of Hawkes process and possesses the high predictive power of deep learning methods, bridging the gap between prediction and understanding of information cascades. We verify the effectiveness of DeepHawkes by applying it to predict retweet cascades of Sina Weibo and citation cascades of a longitudinal citation dataset. Experimental results demonstrate that DeepHawkes outperforms both feature-based and generative approaches.