Detecting changes in content and posting time distributions in social media
Authors: Kazumi Saito, K. Ohara, M. Kimura, H. Motoda
Venue: 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2013)
Publication date: 2013-08-25
DOI: 10.1145/2492517.2492618 (https://doi.org/10.1145/2492517.2492618)
Citations: 4
Abstract
We address the problem of detecting changes in information posted to social media, taking both the content distribution and the posting time distribution into account. To this end, we introduce a generative model with two components, one for the content distribution and the other for the timing distribution, and approximate the shape of the parameter change by a series of step functions. We then propose an efficient algorithm that detects change points by maximizing the likelihood of generating the observed sequence data; its time complexity is almost proportional to the length of the observed sequence (that is, the number of possible change points). We evaluate the method experimentally on synthetic data streams and demonstrate that considering both distributions improves accuracy. We further apply the method to real scoring stream data extracted from a Japanese word-of-mouth communication site for cosmetics, and show that it detects change points and that the detected parameter change patterns are interpretable through an in-depth investigation of actual reviews.
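The idea of locating a step change by maximizing a joint likelihood over content and timing can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a categorical distribution over content labels, exponentially distributed gaps between posts, and a single change point. The names `best_single_change`, `_content_ll`, and `_timing_ll` are hypothetical. Running totals keep the scan linear in the sequence length for a fixed number of categories.

```python
import math
from typing import List, Tuple


def _content_ll(counts: List[int]) -> float:
    """Maximized categorical log-likelihood of a segment with these label counts."""
    n = sum(counts)
    return sum(c * math.log(c / n) for c in counts if c > 0)


def _timing_ll(n: int, total_gap: float) -> float:
    """Maximized exponential log-likelihood of n gaps that sum to total_gap."""
    if n == 0 or total_gap <= 0:
        # Guard for this sketch only: degenerate segments contribute nothing.
        return 0.0
    rate = n / total_gap  # maximum-likelihood rate estimate
    return n * math.log(rate) - n  # n*log(rate) - rate*total_gap


def best_single_change(
    gaps: List[float], labels: List[int], num_categories: int
) -> Tuple[int, float]:
    """Return (k, gain): the split index k maximizing the joint log-likelihood
    of a one-step model, and its gain over the no-change model.

    Items 0..k-1 form the first segment and k..n-1 the second.
    Assumed model (not from the source): categorical content, exponential gaps.
    """
    n = len(gaps)
    total_counts = [0] * num_categories
    for label in labels:
        total_counts[label] += 1
    total_gap = sum(gaps)
    base = _content_ll(total_counts) + _timing_ll(n, total_gap)

    left_counts = [0] * num_categories
    left_gap = 0.0
    best_k, best_gain = 0, 0.0
    for k in range(1, n):
        # Move item k-1 from the right segment to the left segment.
        left_counts[labels[k - 1]] += 1
        left_gap += gaps[k - 1]
        right_counts = [t - l for t, l in zip(total_counts, left_counts)]
        ll = (
            _content_ll(left_counts)
            + _content_ll(right_counts)
            + _timing_ll(k, left_gap)
            + _timing_ll(n - k, total_gap - left_gap)
        )
        gain = ll - base
        if gain > best_gain:
            best_k, best_gain = k, gain
    return best_k, best_gain
```

For example, a stream whose first 20 posts arrive slowly with label 0 and whose next 20 arrive quickly with label 1 should yield a split at index 20. Extending the sketch to several change points (for instance by recursive splitting with a gain threshold) is a further assumption and is not taken from the abstract.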