Jiacai Ni, Guoliang Li, Jun Zhang, Lei Li, Jianhua Feng
{"title":"Adapt: adaptive database schema design for multi-tenant applications","authors":"Jiacai Ni, Guoliang Li, Jun Zhang, Lei Li, Jianhua Feng","doi":"10.1145/2396761.2398601","DOIUrl":"https://doi.org/10.1145/2396761.2398601","url":null,"abstract":"Multi-tenant data management is a major application of software as a Service (SaaS). Many companies outsource their data to a third party which hosts a multi-tenant database system to provide data management service. The system should have high performance, low space and excellent scalability. One big challenge is to devise a high-quality database schema. Independent Tables Shared Instances and Shared Tables Shared Instances are two state-of-the-art methods. However, the former has poor scalability, while the latter achieves good scalability at the expense of poor performance and high space overhead. In this paper, we trade-off between the two methods and propose an adaptive database schema design approach to achieve good scalability and high performance with low space. To this end, we identify the important attributes and use them to generate a base table. For other attributes, we construct supplementary tables. We propose a cost-based model to adaptively generate the tables above. Our method has the following advantages. First, our method achieves high scalability. Second, our method can trade-off performance and space requirement. Third, our method can be easily applied to existing databases (e.g., MySQL) with minor revisions. Fourth, our method can adapt to any schemas and query workloads. Experimental results show our method achieves high performance and good scalability with low space and outperforms state-of-the-art method.","PeriodicalId":313414,"journal":{"name":"Proceedings of the 21st ACM international conference on Information and knowledge management","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130740867","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Zhukovskiy, D. Vinogradov, Gleb Gusev, P. Serdyukov, A. Raigorodskii
{"title":"Recency-sensitive model of web page authority","authors":"M. Zhukovskiy, D. Vinogradov, Gleb Gusev, P. Serdyukov, A. Raigorodskii","doi":"10.1145/2396761.2398708","DOIUrl":"https://doi.org/10.1145/2396761.2398708","url":null,"abstract":"Traditional link-based web ranking algorithms run on a single web snapshot without concern of the dynamics of web pages and links. In particular, the correlation of web pages freshness and their classic PageRank is negative (see [11]). For this reason, in recent years a number of authors introduce some algorithms of PageRank actualization. We introduce our new algorithm called Actual PageRank, which generalizes some previous approaches and therefore provides better capability for capturing the dynamics of the Web. To the best of our knowledge we are the first to conduct ranking evaluations of a fresh-aware variation of PageRank on a large data set. The results demonstrate that our method achieves more relevant and fresh results than both classic PageRank and its \"fresh\" modifications.","PeriodicalId":313414,"journal":{"name":"Proceedings of the 21st ACM international conference on Information and knowledge management","volume":"465 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132415385","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jiwoon Ha, Soon-Hyoung Kwon, Sang-Wook Kim, C. Faloutsos, Sunju Park
{"title":"Top-N recommendation through belief propagation","authors":"Jiwoon Ha, Soon-Hyoung Kwon, Sang-Wook Kim, C. Faloutsos, Sunju Park","doi":"10.1145/2396761.2398636","DOIUrl":"https://doi.org/10.1145/2396761.2398636","url":null,"abstract":"The top-n recommendation focuses on finding the top-n items that the target user is likely to purchase rather than predicting his/her ratings on individual items. In this paper, we propose a novel method that provides top-n recommendation by probabilistically determining the target user's preference on items. This method models the purchasing relationships between users and items as a bipartite graph and employs Belief Propagation to compute the preference of the target user on items. We analyze the proposed method in detail by examining the changes in recommendation accuracy under different parameter settings. We also show that the proposed method is up to 40% more accurate than an existing method by comparing it with an RWR-based method via extensive experiments.","PeriodicalId":313414,"journal":{"name":"Proceedings of the 21st ACM international conference on Information and knowledge management","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130410438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PathRank: a novel node ranking measure on a heterogeneous graph for recommender systems","authors":"S. Lee, Sungchan Park, Minsuk Kahng, Sang-goo Lee","doi":"10.1145/2396761.2398488","DOIUrl":"https://doi.org/10.1145/2396761.2398488","url":null,"abstract":"In this paper, we present a novel random-walk based node ranking measure, PathRank, which is defined on a heterogeneous graph by extending the Personalized PageRank algorithm. Not only can our proposed measure exploit the semantics behind the different types of nodes and edges in a heterogeneous graph, but also it can emulate various recommendation semantics such as collaborative filtering, content-based filtering, and their combinations. The experimental results show that PathRank can produce more various and effective recommendation results compared to existing approaches.","PeriodicalId":313414,"journal":{"name":"Proceedings of the 21st ACM international conference on Information and knowledge management","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127879461","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enabling direct interest-aware audience selection","authors":"A. Fuxman, A. Kannan, Z. Li, Panayiotis Tsaparas","doi":"10.1145/2396761.2396836","DOIUrl":"https://doi.org/10.1145/2396761.2396836","url":null,"abstract":"Advertisers typically have a fairly accurate idea of the interests of their target audience. However, today's online advertising systems are unable to leverage this information. The reasons are two-fold. First, there is no agreed upon vocabulary of interests for advertisers and advertising systems to communicate. More importantly, advertising systems lack a mechanism for mapping users to the interest vocabulary. In this paper, we tackle both problems. We present a system for direct interest-aware audience selection. This system takes the query histories of search engine users as input, extracts their interests, and describes them with interpretable labels. The labels are not drawn from a predefined taxonomy, but rather dynamically generated from the query histories, and are thus easy for the advertisers to interpret and use for targeting users. In addition, the system enables seamless addition of interest labels that may be provided by the advertiser.","PeriodicalId":313414,"journal":{"name":"Proceedings of the 21st ACM international conference on Information and knowledge management","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127955726","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A constraint to automatically regulate document-length normalisation","authors":"Ronan Cummins, C. O'Riordan","doi":"10.1145/2396761.2398662","DOIUrl":"https://doi.org/10.1145/2396761.2398662","url":null,"abstract":"Retrieval functions in information retrieval (IR) are fundamental to the effectiveness of search systems. However, considerable parameter tuning is often needed to increase the effectiveness of the retrieval. Document length normalisation is one such aspect that requires tuning on a per-query and per-collection basis for many retrieval functions. In this paper, we develop an approach that regularises the level of normalisation to apply on a per-query basis. We formally describe the interaction between query-terms and document length normalisation using a constraint. We then develop a general pre-retrieval approach to adapt a number of state-of-the-art ranking functions so that they adhere to the constraint. Finally, we empirically demonstrate that the adapted retrieval functions outperform default versions of the original retrieval functions, and perform at least comparably to tuned versions of the original functions, on a number of datasets. Essentially this regulates the normalisation parameter in a number of retrieval functions on a per-query basis in a principled manner.","PeriodicalId":313414,"journal":{"name":"Proceedings of the 21st ACM international conference on Information and knowledge management","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128815091","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning to rank by aggregating expert preferences","authors":"M. Volkovs, H. Larochelle, R. Zemel","doi":"10.1145/2396761.2396868","DOIUrl":"https://doi.org/10.1145/2396761.2396868","url":null,"abstract":"We present a general treatment of the problem of aggregating preferences from several experts into a consensus ranking, in the context where information about a target ranking is available. Specifically, we describe how such problems can be converted into a standard learning-to-rank one on which existing learning solutions can be invoked. This transformation allows us to optimize the aggregating function for any target IR metric, such as Normalized Discounted Cumulative Gain, or Expected Reciprocal Rank. When applied to crowdsourcing and meta-search benchmarks, our new algorithm improves on state-of-the-art preference aggregation methods.","PeriodicalId":313414,"journal":{"name":"Proceedings of the 21st ACM international conference on Information and knowledge management","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128824694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the inference of average precision from score distributions","authors":"Ronan Cummins","doi":"10.1145/2396761.2398660","DOIUrl":"https://doi.org/10.1145/2396761.2398660","url":null,"abstract":"Modelling the document scores returned from an IR system for a given query using parameterised score distributions is an area of research that has become more popular in recent years. Score distribution (SD) models are useful for a number of IR tasks. These include data fusion, query performance prediction, determining thresholds in filtering applications, and tasks in the area of distributed retrieval. The inference of performance metrics, such as average precision, from these SD models is an important consideration. In this paper, we study the accuracy of a number of methods of inferring average precision from an SD model.","PeriodicalId":313414,"journal":{"name":"Proceedings of the 21st ACM international conference on Information and knowledge management","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128865562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Karthik Raman, K. Svore, Ran Gilad-Bachrach, C. Burges
{"title":"Learning from mistakes: towards a correctable learning algorithm","authors":"Karthik Raman, K. Svore, Ran Gilad-Bachrach, C. Burges","doi":"10.1145/2396761.2398546","DOIUrl":"https://doi.org/10.1145/2396761.2398546","url":null,"abstract":"Many learning algorithms generate complex models that are difficult for a human to interpret, debug, and extend. In this paper, we address this challenge by proposing a new learning paradigm called correctable learning, where the learning algorithm receives external feedback about which data examples are incorrectly learned. We define a set of metrics which measure the correctability of a learning algorithm. We then propose a simple and efficient correctable learning algorithm which learns local models for different regions of the data space. Given an incorrect example, our method samples data in the neighborhood of that example and learns a new, more correct local model over that region. Experiments over multiple classification and ranking datasets show that our correctable learning algorithm offers significant improvements over the state-of-the-art techniques.","PeriodicalId":313414,"journal":{"name":"Proceedings of the 21st ACM international conference on Information and knowledge management","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126676281","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jaehoon Choi, Donghyeon Kim, Seongsoon Kim, Junkyu Lee, Sangrak Lim, Sunwon Lee, Jaewoo Kang
{"title":"CONSENTO: a new framework for opinion based entity search and summarization","authors":"Jaehoon Choi, Donghyeon Kim, Seongsoon Kim, Junkyu Lee, Sangrak Lim, Sunwon Lee, Jaewoo Kang","doi":"10.1145/2396761.2398547","DOIUrl":"https://doi.org/10.1145/2396761.2398547","url":null,"abstract":"Search engines have become an important decision making tool today. Decision making queries are often subjective, such as \"a good birthday present for my girlfriend\", \"best action movies in 2010\", to name a few. Unfortunately, such queries may not be answered properly by conventional search systems. In order to address this problem, we introduce Consento, a consensus search engine designed to answer subjective queries. Consento performs segment indexing, as opposed to document indexing, to capture semantics from user opinions more precisely. In particular, we define a new indexing unit, Maximal Coherent Semantic Unit (MCSU). An MCSU represents a segment of a document, which captures a single coherent semantic. We also introduce a new ranking method, called ConsensusRank that counts online comments referring to an entity as a weighted vote. In order to validate the efficacy of the proposed framework, we compare Consento with standard retrieval models and their recent extensions for opinion based entity ranking. Experiments using movie and hotel data show the effectiveness of our framework.","PeriodicalId":313414,"journal":{"name":"Proceedings of the 21st ACM international conference on Information and knowledge management","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126493905","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}