{"title":"Improved Triangle Counting in Graph Streams: Power of Multi-Sampling","authors":"Neeraj Kavassery-Parakkat, K. Hanjani, A. Pavan","doi":"10.1109/ASONAM.2018.8508789","DOIUrl":"https://doi.org/10.1109/ASONAM.2018.8508789","url":null,"abstract":"Some of the well known streaming algorithms to estimate number of triangles in a graph stream work as follows: Sample a single triangle with high enough probability and repeat this basic step to obtain a global triangle count. For example, the algorithm due to Buriol et al. (PODS 2006) uniformly at random picks a single vertex v and a single edge e and checks whether the two cross edges that connect $v$ to $e$ appear in the stream. Similarly, the neighborhood sampling algorithm (PVLDB 2013) attempts to sample a triangle by randomly choosing a single vertex v, a single neighbor $u$ of $v$ and waits for a third edge that completes the triangle. In both the algorithms, the basic sampling step is repeated multiple times to obtain an estimate for the global triangle count in the input graph stream. In this work, we propose a multi-sampling variant of these algorithms: In case of Buriol et al's algorithm, instead of randomly choosing a single vertex and edge, randomly sample multiple vertices and multiple edges and collect cross edges that connect sampled vertices to the sampled edges. In case of neighborhood sampling algorithm, randomly pick multiple edges and pick multiple neighbors of them. We provide a theoretical analysis of these algorithms and prove that these new algorithms improve upon the known space and accuracy bounds. 
We experimentally show that these algorithms outperform well known triangle counting streaming algorithms.","PeriodicalId":135949,"journal":{"name":"2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127185241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"One Size Does Not Fit All: Predicting Product Returns in E-Commerce Platforms","authors":"Tanuj Joshi, Animesh Mukherjee, Girish Ippadi","doi":"10.1109/ASONAM.2018.8508486","DOIUrl":"https://doi.org/10.1109/ASONAM.2018.8508486","url":null,"abstract":"Providing easy and hassle-free product returns has become a norm for e-commerce companies. However, this flexibility on the part of the customer causes the respective e-commerce companies to incur heavy losses because of the delivery logistics involved and the eventual lower resale value of the product returned. In this paper, we consider data from one of the leading Indian e-commerce companies and investigate the problem of product returns across different lifestyle verticals. One of the striking observations from our measurements is that most of the returns take place for apparel/garments and the major reason for the return, as cited by the customers, is the “size/fit” issue. Here we develop, based on past purchase/return data, a model that, given a user, a brand and a size of the product, can predict whether the user is going to eventually return the product. The methodological novelty of our model is that it combines concepts from network science and machine learning to make the predictions. 
Across three different major verticals of various sizes, we obtain overall F-score improvements between 10%–25% over a naïve baseline where the clusters are obtained using simple random walk with restarts.","PeriodicalId":135949,"journal":{"name":"2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127195845","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Text Mining for Plagiarism Detection: Multivariate Pattern Detection for Recognition of Text Similarities","authors":"Konstantinos F. Xylogiannopoulos, P. Karampelas, R. Alhajj","doi":"10.1109/ASONAM.2018.8508265","DOIUrl":"https://doi.org/10.1109/ASONAM.2018.8508265","url":null,"abstract":"In recent years, the problem of plagiarism has been intensified by the availability of information in digital form and the accessibility of electronic libraries through the Internet. As a result, plagiarism detection has been transformed into a big data analytics problem, since the number of digital sources is extravagant and a new document needs to be compared with millions of other existing documents. In this paper, a text mining methodology is proposed that can detect all common patterns between a document and the documents in a reference database. The technique is based on a pattern detection algorithm and the corresponding data structure that enables the algorithm to detect all common patterns. The methodology has been applied to a well-defined dataset, providing very promising results and identifying difficult cases of plagiarism such as technical disguise.","PeriodicalId":135949,"journal":{"name":"2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123579614","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ASONAM 2018 Program Committee","authors":"A. Caimo, A. Bozzon, Alessandro Epasto, Aneesh Sharma, Anirban Dasgupta, Cristina Ioana Muntean, Edgar Meij, Edoardo Serra, F. M. Nardini, Guolei Yang, H. Rabiee, Hongyun Cai, Hongzhi Yin, Huan Liu, H. Rangwala, G. Mason, Ingmar Weber, Ingo Scholtes, Jie Tang","doi":"10.1109/asonam.2018.8508303","DOIUrl":"https://doi.org/10.1109/asonam.2018.8508303","url":null,"abstract":"","PeriodicalId":135949,"journal":{"name":"2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116039635","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Implicit Entity Linking Through Ad-Hoc Retrieval","authors":"Hawre Hosseini, Tam T. Nguyen, E. Bagheri","doi":"10.1109/ASONAM.2018.8508612","DOIUrl":"https://doi.org/10.1109/ASONAM.2018.8508612","url":null,"abstract":"The systematic linking of explicitly-observed phrases within a document to entities of a knowledge base has already been explored in a process known as entity linking. The objective of this paper, however, is to identify and entity link those entities that are not mentioned but are implied within a document, more specifically within a tweet. This process is referred to as implicit entity linking. Unlike prior work that builds a representation for each entity based on its related content in the knowledge base, we propose to perform implicit entity linking by determining how a tweet is related to user-generated content posted online and as such indirectly perform entity linking. We formulate this problem as an ad-hoc document retrieval process where the input query is the tweet, which needs to be implicitly linked, and the document space is the set of user-generated content related to the entities of the knowledge base. We systematically compare our work with the state-of-the-art baseline and show that our method is able to provide statistically significant improvements.","PeriodicalId":135949,"journal":{"name":"2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125125216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Weakly Supervised Learning for Fake News Detection on Twitter","authors":"Stefan Helmstetter, Heiko Paulheim","doi":"10.1109/ASONAM.2018.8508520","DOIUrl":"https://doi.org/10.1109/ASONAM.2018.8508520","url":null,"abstract":"The problem of automatic detection of fake news in social media, e.g., on Twitter, has recently drawn some attention. Although, from a technical perspective, it can be regarded as a straight-forward, binary classification problem, the major challenge is the collection of large enough training corpora, since manual annotation of tweets as fake or non-fake news is an expensive and tedious endeavor. In this paper, we discuss a weakly supervised approach, which automatically collects a large-scale, but very noisy training dataset comprising hundreds of thousands of tweets. During collection, we automatically label tweets by their source, i.e., trustworthy or untrustworthy source, and train a classifier on this dataset. We then use that classifier for a different classification target, i.e., the classification of fake and non-fake tweets. 
Although the labels are not accurate according to the new classification target (not all tweets by an untrustworthy source need to be fake news, and vice versa), we show that despite this unclean inaccurate dataset, it is possible to detect fake news with an F1 score of up to 0.9.","PeriodicalId":135949,"journal":{"name":"2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130555804","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Model Bots, not Humans on Social Media","authors":"Nikan Chavoshi, A. Mueen","doi":"10.1109/ASONAM.2018.8508279","DOIUrl":"https://doi.org/10.1109/ASONAM.2018.8508279","url":null,"abstract":"The Posting schedule reveals characteristic patterns of users on social media. Motivated by this knowledge, several researchers have modeled posting schedules and argued that deviation from the model indicates bot or spammer characteristics. It is true that circadian rhythms induce regularity in human posting behavior; however, in this paper, we show that this regularity is an individual trait and insufficient to develop a generic model. More surprisingly, we show that bots are more structured in their posting behaviors compared to humans by using a Convolutional Neural Network (CNN). More precisely, we demonstrate using Class Activation Maps that bots contain less entropy than humans. Thus, we conclude that bots are more amenable to generic models than humans. We evaluate the hypothesis on more than 32 million posts from 12 thousand Twitter users with 97% accuracy.","PeriodicalId":135949,"journal":{"name":"2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116195385","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Positive Cognitive Restructuring Through an App Based on Context Messages","authors":"J. Conesa, D. Gañán, A. Pérez-Navarro, R. Nieto, Gemma Ruiz, F. S. Rubió, B. Sora","doi":"10.1109/ASONAM.2018.8508849","DOIUrl":"https://doi.org/10.1109/ASONAM.2018.8508849","url":null,"abstract":"Chronic pain is a very common problem worldwide, and helping people cope with it is fundamental for improving their quality of life. Since smartphones are available anywhere and anytime for all users, the present work proposes the development of an App that helps users to change their mood when facing low back and cervical pain. The App will drive the user through several screens that will help him or her to challenge their negative thoughts for more positive ones. This process will be driven through messages and questions proposed by researchers with expertise in health and pain management, but also through messages and questions proposed by users themselves. The main contributions of this work are: 1) using an App to face pain through a process of cognitive restructuring; and 2) sending messages and questions based on the context of the user, by taking into account his or her previous answers, the environment and time of the day.","PeriodicalId":135949,"journal":{"name":"2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)","volume":"136 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116325287","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Parallel Community Detection Algorithm Based on Incremental Clustering in Dynamic Network","authors":"Cuiyun Zhang, Yunlei Zhang, Bin Wu","doi":"10.1109/ASONAM.2018.8508730","DOIUrl":"https://doi.org/10.1109/ASONAM.2018.8508730","url":null,"abstract":"Dynamic community detection is a key method for the research of network evolution. However, most existing dynamic community detection algorithms are time-consuming in dealing with large-scale networks. Moreover, most current parallel community detection algorithms are static and they ignore the changes of network structure over time. In this paper, we propose a novel parallel algorithm based on incremental vertices, called PICD, which is able to process large-scale dynamic networks. In the PICD algorithm, the revised Parallel Weighted Community Clustering (PWCC) metric is conducive to convenient calculation and is more sensitive to community structure than other metrics. The PICD approach consists of two main steps. First, it identifies the incremental vertices in the dynamic network. Second, it maximizes the PWCC of the entire network by adjusting only the community membership of incremental vertices to capture community structure in high quality. The results of experiments on both synthetic and real-world networks demonstrate that the PICD algorithm achieves higher accuracy and efficiency. Moreover, it performs more stably than most of the baseline methods. 
The experiments also show that PICD algorithm takes an almost linear time with the growth of the network scale.","PeriodicalId":135949,"journal":{"name":"2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)","volume":"136 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116335284","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Characterizing Politically Engaged Users' Behavior During the 2016 US Presidential Campaign","authors":"J. Caetano, J. Almeida, H. T. Marques-Neto","doi":"10.1109/ASONAM.2018.8508459","DOIUrl":"https://doi.org/10.1109/ASONAM.2018.8508459","url":null,"abstract":"Political campaigns have frequently used the online social network as an important environment to exhibit the candidate ideas, their activities, and their electoral plans if elected. Some users are more politically engaged than others. As an example, we can observe intense political debates, especially during major campaigns on Twitter. In such context, this paper presents a characterization of politically engaged user groups on Twitter during the 2016 US Presidential Campaign. Using a rich dataset with 23 million tweets, 115 thousand user profiles and their contact network collected from January 2016 to November 2016, we identified four politically engaged user groups: advocates for both main candidates, political bots, and regular users. We present a characterization of how Twitter users behave during a political campaign through the language patterns analysis of tweets, which users receive more popularity during the campaign and how tweets from each candidate may have affected their mood variation, as expressed by the messages they share.","PeriodicalId":135949,"journal":{"name":"2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116482130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}