{"title":"Exploiting Cluster-based Meta Paths for Link Prediction in Signed Networks","authors":"Jiangfeng Zeng, Ke Zhou, Xiao Ma, F. Zou, Hua Wang","doi":"10.1145/2983323.2983870","DOIUrl":"https://doi.org/10.1145/2983323.2983870","url":null,"abstract":"Many online social networks can be described by signed networks, where positive links signify friendship, trust, and liking, while negative links indicate enmity, distrust, and dislike. Predicting the sign of the links in these networks has attracted a great deal of attention in the areas of friendship recommendation and trust relationship prediction. Existing methods for sign prediction tend to rely on path-based features, which are limited by the sparsity of the network. To address this issue, we introduce a novel sign prediction model that exploits cluster-based meta paths, which can take advantage of both local and global information of the input networks. First, features based on cluster-based meta paths are constructed by incorporating the clusters newly generated by hierarchically clustering the input networks. Then, a logistic regression classifier is employed to train the model and predict the hidden signs of the links. 
Extensive experiments on Epinions and Slashdot datasets demonstrate the efficiency of our proposed method in terms of Accuracy and Coverage.","PeriodicalId":250808,"journal":{"name":"Proceedings of the 25th ACM International on Conference on Information and Knowledge Management","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121777475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sequential Query Expansion using Concept Graph","authors":"Saeid Balaneshinkordan, Alexander Kotov","doi":"10.1145/2983323.2983857","DOIUrl":"https://doi.org/10.1145/2983323.2983857","url":null,"abstract":"Manually and automatically constructed concept graphs (or semantic networks), in which the nodes correspond to words or phrases and the typed edges designate semantic relationships between words and phrases, have been previously shown to be rich sources of effective latent concepts for query expansion. However, finding good expansion concepts for a given query in large and dense concept graphs is a challenging problem, since the number of candidate concepts that are related to query terms and phrases and need to be examined increases exponentially with the distance from the original query concepts. In this paper, we propose a two-stage feature-based method for sequential selection of the most effective concepts for query expansion from a concept graph. In the first stage, the proposed method weighs the concepts according to different types of computationally inexpensive features, including collection and concept graph statistics. In the second stage, a sequential concept selection algorithm utilizing more expensive features is applied to find the most effective expansion concepts at different distances from the original query concepts. 
Experiments on TREC datasets of different types indicate that the proposed method achieves significant improvement in retrieval accuracy over state-of-the-art methods for query expansion using concept graphs.","PeriodicalId":250808,"journal":{"name":"Proceedings of the 25th ACM International on Conference on Information and Knowledge Management","volume":"132 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115889313","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Large-Scale Analysis of Viewing Behavior: Towards Measuring Satisfaction with Mobile Proactive Systems","authors":"Qi Guo, Yang Song","doi":"10.1145/2983323.2983846","DOIUrl":"https://doi.org/10.1145/2983323.2983846","url":null,"abstract":"Recently, proactive systems such as Google Now and Microsoft Cortana have become increasingly popular in reforming the way users access information on mobile devices. In these systems, relevant content is presented to users based on their context without a query in the form of information cards that do not require a click to satisfy the users. As a result, prior approaches based on clicks cannot provide reliable measurements of user satisfaction with such systems. It is also unclear how much of the previous findings regarding good abandonment with reactive Web searches can be applied to these proactive systems due to the intrinsic difference in user intent, the greater variety of content types and their presentations. In this paper, we present the first large-scale analysis of viewing behavior based on the viewport (the visible fraction of a Web page) of the mobile devices, towards measuring user satisfaction with the information cards of the mobile proactive systems. In particular, we identified and analyzed a variety of factors that may influence the viewing behavior, including biases from ranking positions, the types and attributes of the information cards, and the touch interactions with the mobile devices. 
We show that by modeling the various factors we can better measure user satisfaction with the mobile proactive systems, enabling stronger statistical power in large-scale online A/B testing.","PeriodicalId":250808,"journal":{"name":"Proceedings of the 25th ACM International on Conference on Information and Knowledge Management","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129963450","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"BIGtensor: Mining Billion-Scale Tensor Made Easy","authors":"Namyong Park, Byungsoo Jeon, Jungwoo Lee, U. Kang","doi":"10.1145/2983323.2983332","DOIUrl":"https://doi.org/10.1145/2983323.2983332","url":null,"abstract":"Many real-world data are naturally represented as tensors, or multi-dimensional arrays. Tensor decomposition is an important tool to analyze tensors for various applications such as latent concept discovery, trend analysis, clustering, and anomaly detection. However, existing tools for tensor analysis either do not scale well to billion-scale tensors or offer limited functionality. In this paper, we propose BIGtensor, a large-scale tensor mining library that tackles both of the above problems. Carefully designed for scalability, BIGtensor decomposes tensors at least 100× larger than the current state of the art. Furthermore, BIGtensor provides a variety of distributed tensor operations and tensor generation methods. We demonstrate how BIGtensor can help users discover hidden concepts and analyze trends from large-scale tensors that are hard to process with existing tensor tools.","PeriodicalId":250808,"journal":{"name":"Proceedings of the 25th ACM International on Conference on Information and Knowledge Management","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134030317","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Understanding Mobile Searcher Attention with Rich Ad Formats","authors":"Dmitry Lagun, Donal McMahon, Vidhya Navalpakkam","doi":"10.1145/2983323.2983853","DOIUrl":"https://doi.org/10.1145/2983323.2983853","url":null,"abstract":"Mobile Search experiences have evolved significantly from a few blue links that require users to click. Recent search and ad units surface instant information to the user in a variety of visually rich formats that include images, horizontal swipes, and vertical scrolls. These innovative experiences call for new metrics and models to better understand searcher behavior on mobile phones. In this paper, we study how the presence of ads and their formats impact searchers' gaze and satisfaction. We systematically vary the presentation format of the sponsored result, while controlling for other factors, such as position and quality of organic results. We experiment with several configurations of text ad and rich ad formats. Our findings indicate that showing rich ad formats improves the search experience, by drawing more attention to the information-rich ad and allowing users to interact to view more offers, which increases user satisfaction with search. In addition, we extend prior work by comparing the performance of various models to infer users' gaze from viewport data. Our models improve the accuracy of existing viewport-based gaze inference methods by 30% in Pearson's correlation. 
Together, our findings show that viewport data can be used for fast, accurate and scalable measurement of user attention on a per-element basis, for both ads as well as organic search results.","PeriodicalId":250808,"journal":{"name":"Proceedings of the 25th ACM International on Conference on Information and Knowledge Management","volume":"343 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134041620","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ACM DAVA'16: 2nd International Workshop on DAta mining meets Visual Analytics at Big Data Era","authors":"Lei Shi, Hanghang Tong, Chaoli Wang, L. Akoglu","doi":"10.1145/2983323.2988539","DOIUrl":"https://doi.org/10.1145/2983323.2988539","url":null,"abstract":"The theme of this workshop is to bridge data mining and visual analytics for information and knowledge management. The topics include, but are not limited to, the following: Big data mining and visual analytics, theory and foundations -- Knowledge discovery with data mining and visual analytics technologies -- Fusion, mining and visualization of rich and heterogeneous data sources -- Security and privacy issues in data mining and visual analytics systems -- Information, social and biological graph mining and visualization -- Novel methods on visualization-oriented data mining -- Visual representations and interaction techniques of data mining results -- Data management and knowledge representation including scalable data representations -- Mathematical foundations and algorithms in data mining to allow interactive visual analysis -- Analytical reasoning including the human analytic, knowledge discovery, perception, and collaborative visual analytics -- Evaluation methods for data mining algorithms and visual analytics systems -- Applications of visual analytics and data mining techniques, including but not limited to applications in science, engineering, public safety, commerce, etc. The DAVA'16 workshop includes 3 invited keynote talks, 2 paper sessions and some posters. Authors of accepted oral papers give a 20-minute presentation on their papers. Three keynote speakers from both data mining and visualization give invited talks in this workshop (40 minutes each). The DAVA'16 organization committee selects one paper of the highest quality to receive the DAVA'16 best paper award and a cash award of $300. 
Extended versions of the selected papers will be recommended to Chinese Journal of Electronics (SCI-indexed) or International Journal of Software and Informatics (IJSI) as a special issue on visual analytics.","PeriodicalId":250808,"journal":{"name":"Proceedings of the 25th ACM International on Conference on Information and Knowledge Management","volume":"23 4","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131613521","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DePP: A System for Detecting Pages to Protect in Wikipedia","authors":"Kelsey Suyehira, Francesca Spezzano","doi":"10.1145/2983323.2983914","DOIUrl":"https://doi.org/10.1145/2983323.2983914","url":null,"abstract":"Wikipedia is based on the idea that anyone can make edits to the website in order to create reliable and crowd-sourced content. Yet with the cover of internet anonymity, some users make changes to the website that do not align with Wikipedia's intended uses. For this reason, Wikipedia allows for some pages of the website to become protected, where only certain users can make revisions to the page. This allows administrators to protect pages from vandalism, libel, and edit wars. However, with over five million pages on Wikipedia, it is impossible for administrators to monitor all pages and manually enforce page protection. In this paper we consider for the first time the problem of deciding whether a page should be protected or not in a collaborative environment such as Wikipedia. We formulate the problem as a binary classification task and propose a novel set of features to decide which pages to protect based on (i) users' page revision behavior and (ii) page categories. We tested our system, called DePP, on a new dataset we built consisting of 13.6K pages (half protected and half unprotected) and 1.9M edits. 
Experimental results show that DePP reaches 93.24% classification accuracy and significantly improves over baselines.","PeriodicalId":250808,"journal":{"name":"Proceedings of the 25th ACM International on Conference on Information and Knowledge Management","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133402802","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hierarchical and Dynamic k-Path Covers","authors":"Takuya Akiba, Yosuke Yano, Naoto Mizuno","doi":"10.1145/2983323.2983712","DOIUrl":"https://doi.org/10.1145/2983323.2983712","url":null,"abstract":"A metric-independent data structure for spatial networks called k-all-path cover (k-APC) has recently been proposed. It involves a set of vertices that covers all paths of size k, and is a general indexing technique that can accelerate various path-related processes on spatial networks, such as route planning and path subsampling to name a few. Although it is a promising tool, it currently has drawbacks pertaining to its construction and maintenance. First, k-APCs, especially for large values of k, are computationally too expensive. Second, an important factor related to quality is ignored by a prevalent construction algorithm. Third, an existing algorithm only focuses on static networks. To address these issues, we propose novel k-APC construction and maintenance algorithms. Our algorithms recursively construct the layers of APCs, which we call the k-all-path cover hierarchy, by using vertex cover heuristics. This allows us to extract k-APCs for various values of k from the hierarchy. We also devise an algorithm to maintain k-APC hierarchies on dynamic networks. Our experiments showed that our construction algorithm can yield high solution quality, and has a short running time for large values of k. 
They also verified that our dynamic algorithm can handle an edge weight change within 40 ms.","PeriodicalId":250808,"journal":{"name":"Proceedings of the 25th ACM International on Conference on Information and Knowledge Management","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133662013","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"User Response Learning for Directly Optimizing Campaign Performance in Display Advertising","authors":"Kan Ren, Weinan Zhang, Yifei Rong, Haifeng Zhang, Yong Yu, Jun Wang","doi":"10.1145/2983323.2983347","DOIUrl":"https://doi.org/10.1145/2983323.2983347","url":null,"abstract":"Learning and predicting user responses, such as clicks and conversions, are crucial for many Internet-based businesses including web search, e-commerce, and online advertising. Typically, a user response model is established by optimizing the prediction accuracy, e.g., minimizing the error between the prediction and the ground truth user response. However, in many practical cases, predicting user responses is only part of a larger predictive or optimization task, where on one hand, the accuracy of a user response prediction determines the final (expected) utility to be optimized, but on the other hand, its learning may also be influenced by the follow-up stochastic process. It is, thus, of great interest to optimize the entire process as a whole rather than treat them independently or sequentially. In this paper, we take real-time display advertising as an example, where the predicted user's ad click-through rate (CTR) is employed to calculate a bid for an ad impression in the second price auction. We reformulate a common logistic regression CTR model by putting it back into its subsequent bidding context: rather than minimizing the prediction error, the model parameters are learned directly by optimizing campaign profit. The gradient update resulting from our formulations naturally fine-tunes the cases where the market competition is high, leading to more cost-effective bidding. 
Our experiments demonstrate that, while maintaining comparable CTR prediction accuracy, our proposed user response learning leads to campaign profit gains as much as 78.2% for offline test and 25.5% for online A/B test over strong baselines.","PeriodicalId":250808,"journal":{"name":"Proceedings of the 25th ACM International on Conference on Information and Knowledge Management","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132728360","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Probabilistic Fusion Framework","authors":"Yael Anava, Anna Shtok, Oren Kurland, Ella Rabinovich","doi":"10.1145/2983323.2983739","DOIUrl":"https://doi.org/10.1145/2983323.2983739","url":null,"abstract":"There are numerous methods for fusing document lists retrieved from the same corpus in response to a query. Many of these methods are based on seemingly unrelated techniques and heuristics. Herein we present a probabilistic framework for the fusion task. The framework provides a formal basis for deriving and explaining many fusion approaches and the connections between them. Instantiating the framework using various estimates yields novel fusion methods, some of which significantly outperform state-of-the-art approaches.","PeriodicalId":250808,"journal":{"name":"Proceedings of the 25th ACM International on Conference on Information and Knowledge Management","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2016-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115666661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}