Learning the information diffusion probabilities by using variance regularized EM algorithm
Hai-Guang Li, Tianyu Cao, Zhao Li
2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2014), 2014-08-17
DOI: 10.1109/ASONAM.2014.6921596
Citations: 4
Abstract
In this paper we address the problem of learning information diffusion probabilities when data on information diffusion is insufficient. By observing information diffusion behavior on the popular social network website Twitter, we find that evidence of information diffusion is extremely sparse. Less than one percent of tweets are retweeted, and retweeting is considered the most important form of information diffusion evidence on Twitter. Previous research on predicting information diffusion probabilities has failed in such scenarios because of overfitting. To overcome this problem, we first propose using the variance of the diffusion probabilities as a measure of model complexity for the independent cascade model. We then propose two regularization schemes to reduce model complexity. The first scheme regularizes the variance of the diffusion probabilities directly. The second scheme regularizes the mean absolute deviation of the logarithm of the diffusion probabilities. We derive an approximate solution for the first scheme and an analytical solution for the second. We conduct experiments by simulating information diffusion on six social network datasets. Experimental results show that the variance regularization scheme outperforms the baseline by a noticeable margin. The mean absolute deviation regularization scheme also performs better than the baseline.
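The two complexity measures named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the weight `lam`, the choice of the mean as the center of the absolute deviation, and the form "log-likelihood minus weighted penalty" are all assumptions made for illustration, since the abstract does not state the exact objective or the EM update rules.

```python
import math
from statistics import fmean


def variance_penalty(probs):
    """Population variance of the edge diffusion probabilities.

    Assumption: one probability per edge, all pooled into a single list.
    """
    mean = fmean(probs)
    return fmean([(p - mean) ** 2 for p in probs])


def log_mad_penalty(probs):
    """Mean absolute deviation of log-probabilities.

    Assumption: deviations are measured around the mean of the logs;
    the abstract does not say which center is used.
    """
    logs = [math.log(p) for p in probs]
    center = fmean(logs)
    return fmean([abs(x - center) for x in logs])


def penalized_objective(log_likelihood, probs, lam, penalty=variance_penalty):
    """Hypothetical regularized objective: fit term minus weighted complexity."""
    return log_likelihood - lam * penalty(probs)
```

Under these assumptions, a set of identical edge probabilities incurs zero penalty under either measure, so increasing `lam` pulls the estimates toward a common value, which is one way to limit overfitting when retweet evidence is sparse.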