Proceedings of the Eighth ACM International Conference on Web Search and Data Mining最新文献_第2页

User Modeling for a Personal Assistant 个人助理的用户建模

Proceedings of the Eighth ACM International Conference on Web Search and Data Mining Pub Date : 2015-02-02 DOI: 10.1145/2684822.2685309

R. Guha, Vineet Gupta, V. Raghunathan, R. Srikant

引用次数: 69

Listwise Approach for Rank Aggregation in Crowdsourcing 众包中排序聚合的列表方法

Proceedings of the Eighth ACM International Conference on Web Search and Data Mining Pub Date : 2015-02-02 DOI: 10.1145/2684822.2685308

Shuzi Niu, Yanyan Lan, J. Guo, Xueqi Cheng, Lei-Ping Yu, Guoping Long

{"title":"Listwise Approach for Rank Aggregation in Crowdsourcing","authors":"Shuzi Niu, Yanyan Lan, J. Guo, Xueqi Cheng, Lei-Ping Yu, Guoping Long","doi":"10.1145/2684822.2685308","DOIUrl":"https://doi.org/10.1145/2684822.2685308","url":null,"abstract":"Inferring a gold-standard ranking over a set of objects, such as documents or images, is a key task to build test collections for various applications like Web search and recommender systems. Crowdsourcing services provide an efficient and inexpensive way to collect judgments via labeling by sets of annotators. We thus study the problem of finding a consensus ranking from crowdsourced judgments. In contrast to conventional rank aggregation methods which minimize the distance between predicted ranking and input judgments from either pointwise or pairwise perspective, we argue that it is critical to consider the distance in a listwise way to emphasize the position importance in ranking. Therefore, we introduce a new listwise approach in this paper, where ranking measure based objective functions are utilized for optimization. In addition, we also incorporate the annotator quality into our model since the reliability of annotators can vary significantly in crowdsourcing. For optimization, we transform the optimization problem to the Linear Sum Assignment Problem, and then solve it by a very efficient algorithm named CrowdAgg guaranteeing the optimal solution. Experimental results on two benchmark data sets from different crowdsourcing tasks show that our algorithm is much more effective, efficient and robust than traditional methods.","PeriodicalId":179443,"journal":{"name":"Proceedings of the Eighth ACM International Conference on Web Search and Data Mining","volume":"2016 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127371353","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 12

Semantic Matching in APP Search APP搜索中的语义匹配

Proceedings of the Eighth ACM International Conference on Web Search and Data Mining Pub Date : 2015-02-02 DOI: 10.1145/2684822.2697046

Juchao Zhuo, Zeqian Huang, Yunfeng Liu, Zhanhui Kang, Xun Cao, Mingzhi Li, Long Jin

引用次数: 9

Chronological Scientific Information Recommendation via Supervised Dynamic Topic Modeling 基于监督动态主题建模的时间科学信息推荐

Proceedings of the Eighth ACM International Conference on Web Search and Data Mining Pub Date : 2015-02-02 DOI: 10.1145/2684822.2697036

Zhuoren Jiang

{"title":"Chronological Scientific Information Recommendation via Supervised Dynamic Topic Modeling","authors":"Zhuoren Jiang","doi":"10.1145/2684822.2697036","DOIUrl":"https://doi.org/10.1145/2684822.2697036","url":null,"abstract":"Scientific information recommendation is crucial to assist scholars for their researches. Citation recommendation is an important field of scientific recommendation. Traditional approaches ignore the chronological nature of the citation recommendation task. In this study, I propose the \"Chronological Citation Recommendation,\" which assumes initial user information need could shift while they are looking for the papers in different time slices. Specifically, I employed a supervised dynamic topic model to characterize the content \"time-varying\" dynamics and constructed a novel heterogeneous graph that contains dynamic topic-based information, time-decay citation information and word-based information. I applied different meta-paths for different ranking hypotheses, which carried different types of information for citation recommendation in different time slices along with information need shifting. I plan to generate the final \"Chronological Citation Recommendation\" rankings by feature integration using Learning to Rank. \"Chronological Citation Recommendation\" will recommend time-series ranking lists based on initial user textual information need. Preliminary experiments on the ACM corpus show that chronological citation recommendation will significantly improve the citation recommendation performance.","PeriodicalId":179443,"journal":{"name":"Proceedings of the Eighth ACM International Conference on Web Search and Data Mining","volume":"580 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132413735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 5

Topics, Tasks & Beyond: Learning Representations for Personalization 主题、任务及其他:个性化学习表征

Proceedings of the Eighth ACM International Conference on Web Search and Data Mining Pub Date : 2015-02-02 DOI: 10.1145/2684822.2697037

Rishabh Mehrotra

{"title":"Topics, Tasks & Beyond: Learning Representations for Personalization","authors":"Rishabh Mehrotra","doi":"10.1145/2684822.2697037","DOIUrl":"https://doi.org/10.1145/2684822.2697037","url":null,"abstract":"Accurate understanding of a user's interests, preferences and behaviours is possibly one of the most critical research challenges faced while developing personalized systems for behavior targeting and information access. We intend to develop comprehensive latent variable models for web search personalization which jointly models user's topical interests along with user's click based relevance preferences while at the same time taking into account user's intended search tasks along with information about other similar users. We further augment this model by incorporating topic-level relevance parameters, which, to the best of our knowledge, is the first attempt at modeling result ranking preferences at the topic level. Additionally, we intend to explore the possibility of modeling users in terms of the search tasks they perform thereby coupling users' topical interests with their search task behavior to learn user representations. Finally, we wish to evaluate the proposition of extending user representations to hierarchical structures as an alternative to existing flat representations. The evaluation of these alternative approaches for user modeling is based on their performance on a variety of tasks such as collaborative query recommendations, user cohort modeling and search result personalization. This proposal provides the motivation to pursue these research directions, summarizes key research problems being targeted, glances through potential ways of tackling these research challenges and highlights some initial results obtained.","PeriodicalId":179443,"journal":{"name":"Proceedings of the Eighth ACM International Conference on Web Search and Data Mining","volume":"243 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132315349","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 2

Finding Subgraphs with Maximum Total Density and Limited Overlap 寻找具有最大总密度和有限重叠的子图

Proceedings of the Eighth ACM International Conference on Web Search and Data Mining Pub Date : 2015-02-02 DOI: 10.1145/2684822.2685298

Oana Balalau, F. Bonchi, T-H. Hubert Chan, Francesco Gullo, Mauro Sozio

{"title":"Finding Subgraphs with Maximum Total Density and Limited Overlap","authors":"Oana Balalau, F. Bonchi, T-H. Hubert Chan, Francesco Gullo, Mauro Sozio","doi":"10.1145/2684822.2685298","DOIUrl":"https://doi.org/10.1145/2684822.2685298","url":null,"abstract":"Finding dense subgraphs in large graphs is a key primitive in a variety of real-world application domains, encompassing social network analytics, event detection, biology, and finance. In most such applications, one typically aims at finding several (possibly overlapping) dense subgraphs which might correspond to communities in social networks or interesting events. While a large amount of work is devoted to finding a single densest subgraph, perhaps surprisingly, the problem of finding several dense subgraphs with limited overlap has not been studied in a principled way, to the best of our knowledge. In this work we define and study a natural generalization of the densest subgraph problem, where the main goal is to find at most $k$ subgraphs with maximum total aggregate density, while satisfying an upper bound on the pairwise Jaccard coefficient between the sets of nodes of the subgraphs. After showing that such a problem is NP-Hard, we devise an efficient algorithm that comes with provable guarantees in some cases of interest, as well as, an efficient practical heuristic. Our extensive evaluation on large real-world graphs confirms the efficiency and effectiveness of our algorithms.","PeriodicalId":179443,"journal":{"name":"Proceedings of the Eighth ACM International Conference on Web Search and Data Mining","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128933750","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 77

Mining Groups Stability in Ubiquitous and Social Environments: Communities, Classes and Clusters 无所不在和社会环境中的采矿群体稳定性:社区、阶级和集群

Proceedings of the Eighth ACM International Conference on Web Search and Data Mining Pub Date : 2015-02-02 DOI: 10.1145/2684822.2697034

Mark Kibanov

{"title":"Mining Groups Stability in Ubiquitous and Social Environments: Communities, Classes and Clusters","authors":"Mark Kibanov","doi":"10.1145/2684822.2697034","DOIUrl":"https://doi.org/10.1145/2684822.2697034","url":null,"abstract":"Ubiquitous Computing is an emerging research area of computer science. Similarly, social network analysis and mining became very important in the last years. We aim to combine these two research areas to explore the nature of processes happening around users. The presented research focuses on exploring and analyzing different groups of persons or entities (communities, clusters and classes), their stability and semantics. An example of ubiquitous social data are social networks captured during scientific conferences using face-to-face RFID proximity tags. Another example of ubiquitous data is crowd-generated environmental sensor data. In this paper we generalize various problems connected to these and further datasets and consider them as a task for measuring group stability. Group stability can be used to improve state-of-the-art methods to analyze data. We also aim to improve the performance of different data mining algorithms, eg. by better handling of data with a skewed density distribution. We describe significant results some experiments that show how the presented approach can be applied and discuss the planned experiments.","PeriodicalId":179443,"journal":{"name":"Proceedings of the Eighth ACM International Conference on Web Search and Data Mining","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115154705","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 1

Fast and Space-Efficient Entity Linking for Queries 用于查询的快速和节省空间的实体链接

Proceedings of the Eighth ACM International Conference on Web Search and Data Mining Pub Date : 2015-02-02 DOI: 10.1145/2684822.2685317

Roi Blanco, G. Ottaviano, E. Meij

{"title":"Fast and Space-Efficient Entity Linking for Queries","authors":"Roi Blanco, G. Ottaviano, E. Meij","doi":"10.1145/2684822.2685317","DOIUrl":"https://doi.org/10.1145/2684822.2685317","url":null,"abstract":"Entity linking deals with identifying entities from a knowledge base in a given piece of text and has become a fundamental building block for web search engines, enabling numerous downstream improvements from better document ranking to enhanced search results pages. A key problem in the context of web search queries is that this process needs to run under severe time constraints as it has to be performed before any actual retrieval takes place, typically within milliseconds. In this paper we propose a probabilistic model that leverages user-generated information on the web to link queries to entities in a knowledge base. There are three key ingredients that make the algorithm fast and space-efficient. First, the linking process ignores any dependencies between the different entity candidates, which allows for a O(k2) implementation in the number of query terms. Second, we leverage hashing and compression techniques to reduce the memory footprint. Finally, to equip the algorithm with contextual knowledge without sacrificing speed, we factor the distance between distributional semantics of the query words and entities into the model. We show that our solution significantly outperforms several state-of-the-art baselines by more than 14% while being able to process queries in sub-millisecond times---at least two orders of magnitude faster than existing systems.","PeriodicalId":179443,"journal":{"name":"Proceedings of the Eighth ACM International Conference on Web Search and Data Mining","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114919690","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 202

The Power of Random Neighbors in Social Networks 随机邻居在社交网络中的作用

Proceedings of the Eighth ACM International Conference on Web Search and Data Mining Pub Date : 2015-02-02 DOI: 10.1145/2684822.2685293

Silvio Lattanzi, Yaron Singer

{"title":"The Power of Random Neighbors in Social Networks","authors":"Silvio Lattanzi, Yaron Singer","doi":"10.1145/2684822.2685293","DOIUrl":"https://doi.org/10.1145/2684822.2685293","url":null,"abstract":"The friendship paradox is a sociological phenomenon first discovered by Feld which states that individuals are likely to have fewer friends than their friends do, on average. This phenomenon has become common knowledge, has several interesting applications, and has also been observed in various data sets. In his seminal paper Feld provides an intuitive explanation by showing that in any graph the average degree of edges in the graph is an upper bound on the average degree of nodes. Despite the appeal of this argument, it does not prove the existence of the friendship paradox. In fact, it is easy to construct networks -- even with power law degree distributions -- where the ratio between the average degree of neighbors and the average degree of nodes is high, but all nodes have the exact same degree as their neighbors. Which models, then, explain the friendship paradox? In this paper we give a strong characterization that provides a formal understanding of the friendship paradox. We show that for any power law graph with exponential parameter in (1,3), when every edge is rewired with constant probability, the friendship paradox holds, i.e. there is an asymptotic gap between the average degree of the sample of polylogarithmic size and the average degree of a random set of its neighbors of equal size. To examine this characterization on real data, we performed several experiments on social network data sets that complement our theoretical analysis. We also discuss the applications of our result to influence maximization.","PeriodicalId":179443,"journal":{"name":"Proceedings of the Eighth ACM International Conference on Web Search and Data Mining","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128183550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 30

FLAME: A Probabilistic Model Combining Aspect Based Opinion Mining and Collaborative Filtering 基于方面的意见挖掘和协同过滤相结合的概率模型FLAME

Proceedings of the Eighth ACM International Conference on Web Search and Data Mining Pub Date : 2015-02-02 DOI: 10.1145/2684822.2685291

Yao Wu, M. Ester

{"title":"FLAME: A Probabilistic Model Combining Aspect Based Opinion Mining and Collaborative Filtering","authors":"Yao Wu, M. Ester","doi":"10.1145/2684822.2685291","DOIUrl":"https://doi.org/10.1145/2684822.2685291","url":null,"abstract":"Aspect-based opinion mining from online reviews has attracted a lot of attention recently. Given a set of reviews, the main task of aspect-based opinion mining is to extract major aspects of the items and to infer the latent aspect ratings from each review. However, users may have different preferences which might lead to different opinions on the same aspect of an item. Even if fine-grained aspect rating analysis is provided for each review, it is still difficult for a user to judge whether a specific aspect of an item meets his own expectation. In this paper, we study the problem of estimating personalized sentiment polarities on different aspects of the items. We propose a unified probabilistic model called Factorized Latent Aspect ModEl (FLAME), which combines the advantages of collaborative filtering and aspect based opinion mining. FLAME learns users' personalized preferences on different aspects from their past reviews, and predicts users' aspect ratings on new items by collective intelligence. Experiments on two online review datasets show that FLAME outperforms state-of-the-art methods on the tasks of aspect identification and aspect rating prediction.","PeriodicalId":179443,"journal":{"name":"Proceedings of the Eighth ACM International Conference on Web Search and Data Mining","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131605812","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 146