{"title":"Exploiting homophily effect for trust prediction","authors":"Jiliang Tang, Huiji Gao, Xia Hu, Huan Liu","doi":"10.1145/2433396.2433405","DOIUrl":"https://doi.org/10.1145/2433396.2433405","url":null,"abstract":"Trust plays a crucial role for online users who seek reliable information. However, in reality, user-specified trust relations are very sparse, i.e., a tiny number of pairs of users with trust relations are buried in a disproportionately large number of pairs without trust relations, making trust prediction a daunting task. As an important social concept, however, trust has received growing attention and interest. Social theories are developed for understanding trust. Homophily is one of the most important theories that explain why trust relations are established. Exploiting the homophily effect for trust prediction provides challenges and opportunities. In this paper, we embark on the challenges to investigate the trust prediction problem with the homophily effect. First, we delineate how it differs from existing approaches to trust prediction in an unsupervised setting. 
Next, we formulate the new trust prediction problem into an optimization problem integrated with homophily, empirically evaluate our approach on two datasets from real-world product review sites, and compare with representative algorithms to gain a deep understanding of the role of homophily in trust prediction.","PeriodicalId":324799,"journal":{"name":"Proceedings of the sixth ACM international conference on Web search and data mining","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121727761","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Building user profiles to improve user experience in recommender systems","authors":"A. Lacerda, N. Ziviani","doi":"10.1145/2433396.2433492","DOIUrl":"https://doi.org/10.1145/2433396.2433492","url":null,"abstract":"Recommender systems are quickly becoming ubiquitous in many Web applications, including e-commerce, social media channels, content providers, among others. These systems act as an enabling mechanism designed to overcome the information overload problem by improving browsing and consumption experience. Crucial to the performance of a recommender system is the accuracy of the user profiles used to represent the interests of the users. In this proposal, we analyze three different aspects of user profiling: (i) selecting the most informative events from the interaction between users and the system, (ii) combining different recommendation algorithms to (iii) including trust-aware information in user profiles to improve the accuracy of recommender systems.","PeriodicalId":324799,"journal":{"name":"Proceedings of the sixth ACM international conference on Web search and data mining","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124064734","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using early view patterns to predict the popularity of youtube videos","authors":"Henrique Pinto, J. Almeida, Marcos André Gonçalves","doi":"10.1145/2433396.2433443","DOIUrl":"https://doi.org/10.1145/2433396.2433443","url":null,"abstract":"Predicting Web content popularity is an important task for supporting the design and evaluation of a wide range of systems, from targeted advertising to effective search and recommendation services. We here present two simple models for predicting the future popularity of Web content based on historical information given by early popularity measures. Our approach is validated on datasets consisting of videos from the widely used YouTube video-sharing portal. Our experimental results show that, compared to a state-of-the-art baseline model, our proposed models lead to significant decreases in relative squared errors, reaching up to 20% reduction on average, and larger reductions (of up to 71%) for videos that experience a high peak in popularity in their early days followed by a sharp decrease in popularity.","PeriodicalId":324799,"journal":{"name":"Proceedings of the sixth ACM international conference on Web search and data mining","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134000737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LaFT-tree: perceiving the expansion trace of one's circle of friends in online social networks","authors":"Jun Zhang, Chaokun Wang, Jianmin Wang, Philip S. Yu","doi":"10.1145/2433396.2433472","DOIUrl":"https://doi.org/10.1145/2433396.2433472","url":null,"abstract":"Many patterns have been discovered to explain and analyze how people make friends. Among them is the triadic closure, supported by the principle of the transitivity of friendship, which means that, for an individual, the friends of her friend are more likely to become her new friends. However, people's motivations under this principle have not been well studied, and it is still unknown how this principle works in diverse situations. In this paper, we study this principle in depth through behavior modeling. We study how one expands her egocentric network via her friends, also called intermediaries, based on the transitivity of friendship. We propose LaFT-Tree, a tree-based representation of friendship formation inspired by triadic closure. LaFT-Tree provides a hierarchical view of the flat structure of one's egocentric network by visualizing the expansion trace of one's egocentric network. We model people's friend-making behaviors using LaFT-LDA, a generative model for LaFT-Tree learning. The proposed model is evaluated on both synthetic and real-world social networks and experimental results demonstrate the effectiveness of LaFT-LDA for LaFT-Tree inference. 
We also present some interesting applications of the LaFT-Tree, showing that our model can be generalized and benefit other social network analysis tasks.","PeriodicalId":324799,"journal":{"name":"Proceedings of the sixth ACM international conference on Web search and data mining","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131339576","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"NCDawareRank: a novel ranking method that exploits the decomposable structure of the web","authors":"A. Nikolakopoulos, J. Garofalakis","doi":"10.1145/2433396.2433415","DOIUrl":"https://doi.org/10.1145/2433396.2433415","url":null,"abstract":"Research about the topological characteristics of the hyperlink graph has shown that the Web possesses a nested block structure, indicative of its innate hierarchical organization. This crucial observation opens the way for new approaches that can usefully regard the Web as a Nearly Completely Decomposable (NCD) system. In recent years, such approaches gave birth to various efficient methods and algorithms that exploit NCD from a computational point of view and manage to considerably accelerate the extraction of the PageRank vector. However, very little has been done towards the qualitative exploitation of NCD. In this paper we propose NCDawareRank, a novel ranking method that uses the intuition behind NCD to generalize and refine PageRank. NCDawareRank considers both the link structure and the hierarchical nature of the Web in a way that preserves the mathematically attractive characteristics of PageRank and at the same time manages to successfully resolve many of its known problems, including Web Spamming Susceptibility and Biased Ranking of Newly Emerging Pages. 
Experimental results show that NCDawareRank is more resistant to direct manipulation, alleviates the problems caused by the sparseness of the link graph and assigns more reasonable ranking scores to newly added pages, while maintaining the ability to be easily implemented on a large-scale and in a computationally efficient manner.","PeriodicalId":324799,"journal":{"name":"Proceedings of the sixth ACM international conference on Web search and data mining","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133307893","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimizing parallel algorithms for all pairs similarity search","authors":"Maha Alabduljalil, Xun Tang, Tao Yang","doi":"10.1145/2433396.2433422","DOIUrl":"https://doi.org/10.1145/2433396.2433422","url":null,"abstract":"All pairs similarity search is used in many web search and data mining applications. Previous work has used comparison filtering, inverted indexing, and parallel accumulation of partial intermediate results to expedite its execution. However, shuffling intermediate results can incur significant communication overhead as data scales up. This paper studies a scalable two-step approach called Partition-based Similarity Search (PSS) which incorporates several optimization techniques. First, PSS uses a static partitioning algorithm that places dissimilar vectors into different groups and balances the comparison workload with a circular assignment. Second, PSS executes comparison tasks in parallel, each using a hybrid data structure that combines the advantages of forward and inverted indexing. Our evaluation results show that the proposed approach leads to an early elimination of unnecessary I/O and data communication while sustaining parallel efficiency. As a result, it improves performance by an order of magnitude when dealing with large datasets.","PeriodicalId":324799,"journal":{"name":"Proceedings of the sixth ACM international conference on Web search and data mining","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117187082","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unsupervised graph-based topic labelling using dbpedia","authors":"Ioana Hulpus, Conor Hayes, Marcel Karnstedt, Derek Greene","doi":"10.1145/2433396.2433454","DOIUrl":"https://doi.org/10.1145/2433396.2433454","url":null,"abstract":"Automated topic labelling brings benefits for users aiming at analysing and understanding document collections, as well as for search engines targeting the linkage between groups of words and their inherent topics. Current approaches to achieve this suffer in quality, but we argue their performance might be improved by setting the focus on the structure in the data. Building upon research for concept disambiguation and linking to DBpedia, we are taking a novel approach to topic labelling by making use of structured data exposed by DBpedia. We start from the hypothesis that words co-occurring in text likely refer to concepts that belong closely together in the DBpedia graph. Using graph centrality measures, we show that we are able to identify the concepts that best represent the topics. We comparatively evaluate our graph-based approach and the standard text-based approach, on topics extracted from three corpora, based on results gathered in a crowd-sourcing experiment. 
Our research shows that graph-based analysis of DBpedia can achieve better results for topic labelling in terms of both precision and topic coverage.","PeriodicalId":324799,"journal":{"name":"Proceedings of the sixth ACM international conference on Web search and data mining","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115846222","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Workshop on large-scale and distributed systems for information retrieval (LSDS-IR 2013)","authors":"N. Tonellotto, C. Macdonald, I. S. Altingövde","doi":"10.1145/2433396.2433505","DOIUrl":"https://doi.org/10.1145/2433396.2433505","url":null,"abstract":"The LSDS-IR'13 workshop aims to bring together information retrieval practitioners from industry and academic researchers concerned with efficient and distributed IR systems. The workshop also welcomes contributions that propose different ways of leveraging diversity and multiplicity of resources available in distributed systems. The main goal of the workshop is to attract people from industry and academia to present and discuss ideas, problems and results in efficiency of large scale and distributed information retrieval systems, and to foster their participation in the WSDM conference.","PeriodicalId":324799,"journal":{"name":"Proceedings of the sixth ACM international conference on Web search and data mining","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122534616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Differences in search engine evaluations between query owners and non-owners","authors":"A. Chouldechova, David Mease","doi":"10.1145/2433396.2433411","DOIUrl":"https://doi.org/10.1145/2433396.2433411","url":null,"abstract":"The query-document relevance judgments used in web search engine evaluation are traditionally provided by human assessors who have no particular association with the specific queries selected for the evaluation. Most commonly, queries are randomly sampled from search logs and in turn randomly assigned to the human assessors. In this paper, we consider a very different approach in which we instead ask the human assessors to provide their own queries from their recent search experiences. Using these queries as our sample, we compare the relevance judgments from the \"owners\" of the queries to the relevance judgments of the non-owners. We conduct experiments which reveal that query ownership has a substantial and beneficial impact on the accuracy of relevance judgments. In particular, we observe that owners are more consistently able to distinguish a higher quality set of search results from a lower quality set in a blind comparison. The implication for web search evaluation is that query owners provide more valuable relevance judgments than non-owners, presumably due to the background knowledge associated with their queries. We quantify the benefit of using owner assessments versus non-owner assessments in terms of sample size reduction. 
We also touch on some of the practical challenges associated with using query owners as assessors.","PeriodicalId":324799,"journal":{"name":"Proceedings of the sixth ACM international conference on Web search and data mining","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114062474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Geo topic model: joint modeling of user's activity area and interests for location recommendation","authors":"Takeshi Kurashima, Tomoharu Iwata, Takahide Hoshide, Noriko Takaya, Ko Fujimura","doi":"10.1145/2433396.2433444","DOIUrl":"https://doi.org/10.1145/2433396.2433444","url":null,"abstract":"This paper proposes a method that analyzes the location log data of multiple users to recommend locations to be visited. The method uses our new topic model, called Geo Topic Model, that can jointly estimate both the user's interests and activity area hosting the user's home, office and other personal places. By explicitly modeling geographical features of locations and users, the user's interests in other features of locations, which we call latent topics, can be inferred effectively. The topic interests estimated by our model 1) lead to high accuracy in predicting visit behavior as driven by personal interests, 2) make possible the generation of recommendations when the user is in an unfamiliar area (e.g. sightseeing), and 3) enable the recommender system to suggest an interpretable representation of the user profile that can be customized by the user. Experiments are conducted using real location logs of landmark and restaurant visits to evaluate the recommendation performance of the proposed method in terms of the accuracy of predicting visit selections. 
We also show that our model can estimate latent features of locations such as art, nature and atmosphere as latent topics, and describe each user's preference based on them.","PeriodicalId":324799,"journal":{"name":"Proceedings of the sixth ACM international conference on Web search and data mining","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2013-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128087960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}