{"title":"Seeking provenance of information using social media","authors":"Pritam Gundecha, Zhuo Feng, Huan Liu","doi":"10.1145/2505515.2505633","DOIUrl":"https://doi.org/10.1145/2505515.2505633","url":null,"abstract":"Social media propagates breaking news and disinformation alike fast and on an unsurpassed scale. Because of its democratizing nature, social media users can easily produce, receive, and propagate a piece of information without necessarily providing traceable information. Thus, there are no means for a user to verify the provenance (aka sources or originators) of information. The disinformation can cause tragic consequences to society and individuals. This work aims to take advantage of characteristics of social media to provide a solution to the problem of lacking traceable information. Such knowledge can provide additional context to received information such that a user can assess how much value, trust, and validity should be placed in it. In this paper, we are studying a novel research problem that facilitates the seeking of the provenance of information for a few known recipients (less than 1% of the total recipients) by recovering the paths it has taken from its originators. The proposed methodology exploits easily computable node centralities of a large social media network. The experimental results with Facebook and Twitter datasets show that the proposed mechanism is effective in correctly identifying the additional recipients and seeking the provenance of information.","PeriodicalId":20528,"journal":{"name":"Proceedings of the 22nd ACM international conference on Information & Knowledge Management","volume":"116 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74679947","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PIKM 2013: the 6th ACM workshop for ph.d. students in information and knowledge management","authors":"Fabian M. Suchanek, A. Nica","doi":"10.1145/2505515.2505817","DOIUrl":"https://doi.org/10.1145/2505515.2505817","url":null,"abstract":"The PIKM workshop gives Ph.D. students an opportunity to present their dissertation proposals at a global stage. Similarly to the CIKM, the PIKM workshop covers a wide range of topics in the areas of databases, information retrieval and knowledge management. Interdisciplinary work across these tracks is particularly encouraged.","PeriodicalId":20528,"journal":{"name":"Proceedings of the 22nd ACM international conference on Information & Knowledge Management","volume":"4 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74250422","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling temporal effects of human mobile behavior on location-based social networks","authors":"Huiji Gao, Jiliang Tang, Xia Hu, Huan Liu","doi":"10.1145/2505515.2505616","DOIUrl":"https://doi.org/10.1145/2505515.2505616","url":null,"abstract":"The rapid growth of location-based social networks (LBSNs) invigorates an increasing number of LBSN users, providing an unprecedented opportunity to study human mobile behavior from spatial, temporal, and social aspects. Among these aspects, temporal effects offer an essential contextual cue for inferring a user's movement. Strong temporal cyclic patterns have been observed in user movement in LBSNs with their correlated spatial and social effects (i.e., temporal correlations). It is a propitious time to model these temporal effects (patterns and correlations) on a user's mobile behavior. In this paper, we present the first comprehensive study of temporal effects on LBSNs. We propose a general framework to exploit and model temporal cyclic patterns and their relationships with spatial and social data. The experimental results on two real-world LBSN datasets validate the power of temporal effects in capturing user mobile behavior, and demonstrate the ability of our framework to select the most effective location prediction algorithm under various combinations of prediction models.","PeriodicalId":20528,"journal":{"name":"Proceedings of the 22nd ACM international conference on Information & Knowledge Management","volume":"39 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76951472","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Merged aggregate nearest neighbor query processing in road networks","authors":"Weiwei Sun, Chong Chen, Baihua Zheng, Chunan Chen, Liang Zhu, Weimo Liu, Y. Huang","doi":"10.1145/2505515.2505738","DOIUrl":"https://doi.org/10.1145/2505515.2505738","url":null,"abstract":"Aggregate nearest neighbor query, which returns a common interesting point that minimizes the aggregate distance for a given query point set, is one of the most important operations in spatial databases and their application domains. This paper addresses the problem of finding the aggregate nearest neighbor for a merged set that consists of the given query point set and multiple points that need to be selected from a candidate set, which we call the merged aggregate nearest neighbor (MANN) query. This paper proposes an effective algorithm to process MANN queries in road networks based on our pruning strategies. Extensive experiments are conducted to examine the behaviors of the solutions, and the overall experiments show that our strategies to minimize the response time are effective and achieve several orders of magnitude speedup compared with the baseline methods.","PeriodicalId":20528,"journal":{"name":"Proceedings of the 22nd ACM international conference on Information & Knowledge Management","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77544814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Graph-of-word and TW-IDF: new approach to ad hoc IR","authors":"F. Rousseau, M. Vazirgiannis","doi":"10.1145/2505515.2505671","DOIUrl":"https://doi.org/10.1145/2505515.2505671","url":null,"abstract":"In this paper, we introduce novel document representation (graph-of-word) and retrieval model (TW-IDF) for ad hoc IR. Questioning the term independence assumption behind the traditional bag-of-word model, we propose a different representation of a document that captures the relationships between the terms using an unweighted directed graph of terms. From this graph, we extract at indexing time meaningful term weights (TW) that replace traditional term frequencies (TF) and from which we define a novel scoring function, namely TW-IDF, by analogy with TF-IDF. This approach leads to a retrieval model that consistently and significantly outperforms BM25 and in some cases its extension BM25+ on various standard TREC datasets. In particular, experiments show that counting the number of different contexts in which a term occurs inside a document is more effective and relevant to search than considering an overall concave term frequency in the context of ad hoc IR.","PeriodicalId":20528,"journal":{"name":"Proceedings of the 22nd ACM international conference on Information & Knowledge Management","volume":"38 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77661939","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Joint learning on sentiment and emotion classification","authors":"Wei Gao, Shoushan Li, Sophia Yat-Mei Lee, Guodong Zhou, Chu-Ren Huang","doi":"10.1145/2505515.2507830","DOIUrl":"https://doi.org/10.1145/2505515.2507830","url":null,"abstract":"Sentiment and emotion classification have been popularly but separately studied in natural language processing. In this paper, we address joint learning on sentiment and emotion classification where labeled data for both sentiment and emotion classification are available. The objective of this joint learning is for the two tasks to benefit from each other and improve their performance. Specifically, an extra data set that is annotated with both sentiment and emotion labels is employed to estimate the transformation probability between the two kinds of labels. Furthermore, the transformation probability is leveraged to transfer the classification labels so that the two tasks benefit from each other. Empirical studies demonstrate the effectiveness of our approach for the novel joint learning task.","PeriodicalId":20528,"journal":{"name":"Proceedings of the 22nd ACM international conference on Information & Knowledge Management","volume":"114 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76872896","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Zero-shot video retrieval using content and concepts","authors":"Jeffrey Dalton, James Allan, P. Mirajkar","doi":"10.1145/2505515.2507880","DOIUrl":"https://doi.org/10.1145/2505515.2507880","url":null,"abstract":"Recent research in video retrieval has been successful at finding videos when the query consists of tens or hundreds of sample relevant videos for training supervised models. Instead, we investigate unsupervised zero-shot retrieval where no training videos are provided: a query consists only of a text statement. For retrieval, we use text extracted from images in the videos, text recognized in the speech of its audio track, as well as automatically detected semantically meaningful visual video concepts identified with widely varying confidence in the videos. In this work we introduce a new method for automatically identifying relevant concepts given a text query using the Markov Random Field (MRF) retrieval framework. We use source expansion to build rich textual representations of semantic video concepts from large external sources such as the web. We find that concept-based retrieval significantly outperforms text based approaches in recall. Using an evaluation derived from the TRECVID MED'11 track, we present early results that an approach using multi-modal fusion can compensate for inadequacies in each modality, resulting in substantial effectiveness gains. With relevance feedback, our approach provides additional improvements of over 50%.","PeriodicalId":20528,"journal":{"name":"Proceedings of the 22nd ACM international conference on Information & Knowledge Management","volume":"32 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78870620","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Beyond data: from user information to business value through personalized recommendations and consumer science","authors":"X. Amatriain","doi":"10.1145/2505515.2514701","DOIUrl":"https://doi.org/10.1145/2505515.2514701","url":null,"abstract":"Since the Netflix $1 million Prize, announced in 2006, Netflix has been known for having personalization at the core of our product. Our current product offering is nowadays focused around instant video streaming, and our data is now many orders of magnitude larger. Not only do we have many more users in many more countries, but we also receive many more streams of data. Besides the ratings, we now also use information such as what our members play, browse, or search. In this paper I will discuss the different approaches we follow to deal with these large streams of user data in order to extract information for personalizing our service. I will describe some of the machine learning models used, and their application in the service. I will also describe our data-driven approach to innovation that combines rapid offline explorations as well as online A/B testing. This approach enables us to convert user information into real and measurable business value.","PeriodicalId":20528,"journal":{"name":"Proceedings of the 22nd ACM international conference on Information & Knowledge Management","volume":"30 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80408311","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving pseudo-relevance feedback via tweet selection","authors":"Taiki Miyanishi, Kazuhiro Seki, K. Uehara","doi":"10.1145/2505515.2505701","DOIUrl":"https://doi.org/10.1145/2505515.2505701","url":null,"abstract":"Query expansion methods using pseudo-relevance feedback have been shown effective for microblog search because they can solve vocabulary mismatch problems often seen in searching short documents such as Twitter messages (tweets), which are limited to 140 characters. Pseudo-relevance feedback assumes that the top ranked documents in the initial search results are relevant and that they contain topic-related words appropriate for relevance feedback. However, those assumptions do not always hold in reality because the initial search results often contain many irrelevant documents. In such a case, only a few of the suggested expansion words may be useful with many others being useless or even harmful. To overcome the limitation of pseudo-relevance feedback for microblog search, we propose a novel query expansion method based on two-stage relevance feedback that models search interests by manual tweet selection and integration of lexical and temporal evidence into its relevance model. Our experiments using a corpus of microblog data (the Tweets2011 corpus) demonstrate that the proposed two-stage relevance feedback approaches considerably improve search result relevance over almost all topics.","PeriodicalId":20528,"journal":{"name":"Proceedings of the 22nd ACM international conference on Information & Knowledge Management","volume":"98 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80545887","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Augmenting web search surrogates with images","authors":"Robert G. Capra, Jaime Arguello, Falk Scholer","doi":"10.1145/2505515.2505714","DOIUrl":"https://doi.org/10.1145/2505515.2505714","url":null,"abstract":"While images are commonly used in search result presentation for vertical domains such as shopping and news, web search results surrogates remain primarily text-based. In this paper, we present results of two large-scale user studies to examine the effects of augmenting text-based surrogates with images extracted from the underlying webpage. We evaluate effectiveness and efficiency at both the individual surrogate level and at the results page level. Additionally, we investigate the influence of two factors: the goodness of the image in terms of representing the underlying page content, and the diversity of the results on a results page. Our results show that at the individual surrogate level, good images provide only a small benefit in judgment accuracy versus text-only surrogates, with a slight increase in judgment time. At the results page level, surrogates with good images had similar effectiveness and efficiency compared to the text-only condition. However, in situations where the results page items had diverse senses, surrogates with images had higher click precision versus text-only ones. Results of these studies show tradeoffs in the use of images in web search surrogates, and highlight particular situations where they can provide benefits.","PeriodicalId":20528,"journal":{"name":"Proceedings of the 22nd ACM international conference on Information & Knowledge Management","volume":"54 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2013-10-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81673012","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}