Rui Yan, Cheng-te Li, Hsun-Ping Hsieh, P. Hu, Xiaohua Hu, Tingting He
Title: Socialized Language Model Smoothing via Bi-directional Influence Propagation on Social Networks
Venue: Proceedings of the 25th International Conference on World Wide Web
Published: 2016-04-11
DOI: 10.1145/2872427.2874811 (https://doi.org/10.1145/2872427.2874811)
Citations: 4
Abstract
Online social networks are now among the most visited websites in the world, measured by page views (PV), because they have changed how information is discovered and distributed. Millions of registered users generate a formidable amount of content every day. These networks have become "giants" that appear able to support almost any research task. However, as we have pointed out previously, these giants still suffer from an "Achilles heel": extreme sparsity. Compared with the very large volume of data in the whole collection, individual posts such as microblogs seem too sparse to make a difference in many research scenarios, even though the posts do in fact differ from one another. In this paper we propose to tackle this Achilles heel by smoothing the language model via influence propagation. Building on our earlier work on the sparsity issue, we extend socialized language model smoothing with bi-directional influence learned from propagation. Intuitively, it is insufficient to treat the influence propagated between an information source and its target as undirected. Hence, we formulate a bi-directional socialized factor graph model, which uses both the textual correlations between document pairs and the socialized augmentation networks behind the documents, such as user relationships and social interactions. These factors are modeled as attributes and dependencies among documents and their corresponding users, and are then distinguished by direction. We propose an effective algorithm for learning the directed factor graph model. Finally, we propagate term counts to smooth documents based on the estimated influence. We run experiments on two distinct datasets, from Twitter and Weibo. The results validate the effectiveness of the proposed model.
By incorporating direction information into socialized language model smoothing, our approach improves over several alternative methods in both intrinsic and extrinsic evaluations, measured by perplexity, nDCG, and MAP.
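
The final smoothing step described in the abstract can be illustrated with a minimal sketch. It assumes the directed influence weights are already available (the paper learns them with a factor graph, which is not reproduced here), and it uses plain linear interpolation between a document's own term distribution and those of the documents that influence it. The interpolation form, the parameter `lam`, and the function names `mle` and `smooth` are assumptions of this illustration, not the authors' formulation.

```python
from collections import Counter


def mle(counts):
    """Maximum-likelihood unigram model from raw term counts."""
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}


def smooth(docs, influence, lam=0.3):
    """Smooth each document model with models of documents that influence it.

    docs:      dict doc_id -> Counter of term counts
    influence: dict (src, dst) -> non-negative weight, read as "src influences dst";
               (a, b) and (b, a) are separate entries, so influence is directed
    lam:       hypothetical interpolation weight given to the propagated mass
    """
    models = {d: mle(c) for d, c in docs.items()}
    smoothed = {}
    for dst, p_dst in models.items():
        # Keep only incoming edges from non-empty source documents.
        incoming = {
            s: w for (s, d), w in influence.items()
            if d == dst and w > 0 and models.get(s)
        }
        z = sum(incoming.values())
        if z == 0:
            smoothed[dst] = dict(p_dst)  # nothing to propagate
            continue
        mix = lam if p_dst else 1.0  # an empty document takes all mass from sources
        p = {t: (1 - mix) * v for t, v in p_dst.items()}
        for s, w in incoming.items():
            for t, v in models[s].items():
                p[t] = p.get(t, 0.0) + mix * (w / z) * v
        smoothed[dst] = p
    return smoothed


# Toy example: post "a" influences post "b", but not the other way round.
docs = {
    "a": Counter(["social", "network", "social"]),
    "b": Counter(["network", "smoothing"]),
}
influence = {("a", "b"): 0.8}
result = smooth(docs, influence, lam=0.5)
```

In the toy example, "b" receives probability mass for the term "social" that it never contained, while "a" is left unchanged because no edge points to it. That asymmetry is the point of modeling direction.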