Proceedings of the 22nd ACM international conference on Information & Knowledge Management: latest papers

Spatial search for K diverse-near neighbors
Gregory Ference, Wang-Chien Lee, Hui-Ju Hung, De-Nian Yang
Abstract: For many location-based service applications that prefer diverse results, finding locations that are spatially diverse and close in proximity to a query point (e.g., the current location of a user) can be more useful than finding the k nearest neighbors/locations. In this paper, we investigate the problem of searching for the k Diverse-Near Neighbors (kDNNs) in spatial space that is based upon the spatial diversity and proximity of candidate locations to the query point. While employing a conventional distance measure for proximity, we develop a new and intuitive diversity metric based upon the variance of the angles among the candidate locations with respect to the query point. Accordingly, we create a dynamic programming algorithm that finds the optimal kDNNs. Unfortunately, the dynamic programming algorithm, with a time complexity of O(kn^3), incurs excessive computational cost. Therefore, we further propose two heuristic algorithms, namely, Distance-based Browsing (DistBrow) and Diversity-based Browsing (DivBrow), that provide high effectiveness while being efficient by exploring the search space prioritized upon the proximity to the query point and spatial diversity, respectively. Using real and synthetic datasets, we conduct a comprehensive performance evaluation. The results show that DistBrow and DivBrow have superior effectiveness compared to state-of-the-art algorithms while maintaining high efficiency.
DOI: https://doi.org/10.1145/2505515.2505747
Published: 2013-10-27
Citations: 8
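The angle-based diversity idea in the abstract above can be illustrated with a small sketch. The abstract does not give the paper's exact metric, so this is only one plausible reading: it measures the variance of the angular gaps between consecutive candidates around the query point, so that evenly spread candidates score 0. The function name `angular_gap_variance` and the choice of gaps are assumptions made for illustration.

```python
import math


def angular_gap_variance(query, points):
    """Variance of the angular gaps between consecutive points around `query`.

    Hypothetical helper, not the paper's definition: 0.0 means the points are
    evenly spread around the query; larger values mean they bunch together.
    """
    qx, qy = query
    two_pi = 2 * math.pi
    # Polar angle of each candidate relative to the query point, in [0, 2*pi].
    angles = sorted(math.atan2(y - qy, x - qx) % two_pi for x, y in points)
    k = len(angles)
    if k < 2:
        return 0.0
    # Gaps between neighbours in angular order, plus the wrap-around gap.
    gaps = [angles[i + 1] - angles[i] for i in range(k - 1)]
    gaps.append(two_pi - (angles[-1] - angles[0]))
    mean = two_pi / k  # the gaps always sum to one full turn
    return sum((g - mean) ** 2 for g in gaps) / k


even = angular_gap_variance((0, 0), [(1, 0), (0, 1), (-1, 0), (0, -1)])
bunched = angular_gap_variance((0, 0), [(1, 0), (1, 0.1), (1, 0.2), (1, 0.3)])
print(even < bunched)
```

Under this reading a lower value means a more diverse candidate set; a full kDNN search would still have to trade this off against distance to the query point, which the sketch does not attempt.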
Entropy-based histograms for selectivity estimation
Hien To, Kuorong Chiang, C. Shahabi
Abstract: Histograms have been extensively used for selectivity estimation by academics and have successfully been adopted by the database industry. However, the estimation error is usually large for skewed distributions and biased attributes, which are typical in real-world data. Therefore, we propose effective models to quantitatively measure bias and selectivity based on information entropy. These models together with the principles of maximum entropy are then used to develop a class of entropy-based histograms. Moreover, since entropy can be computed incrementally, we present the incremental variations of our algorithms that reduce the complexities of the histogram construction from quadratic to linear. We conducted an extensive set of experiments with both synthetic and real-world datasets to compare the accuracy and efficiency of our proposed techniques with many other histogram-based techniques, showing the superiority of the entropy-based approaches for both equality and range queries.
DOI: https://doi.org/10.1145/2505515.2505756
Published: 2013-10-27
Citations: 28
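The remark above that entropy can be computed incrementally can be made concrete with a short sketch. It relies on the standard identity H = log2(N) - (1/N) * sum(c * log2(c)), where c ranges over the frequencies of the distinct values in a bucket and N is their total. This is not the paper's construction algorithm; the class name `RunningEntropy` and its methods are hypothetical, and each distinct value is assumed to be added once with its full frequency.

```python
import math


class RunningEntropy:
    """Shannon entropy of a bucket, updated in O(1) per added distinct value."""

    def __init__(self):
        self.total = 0        # N: sum of all frequencies seen so far
        self.c_log_c = 0.0    # running sum of c * log2(c)

    def add(self, count):
        """Add one distinct value that occurs `count` times."""
        if count <= 0:
            raise ValueError("count must be positive")
        self.total += count
        self.c_log_c += count * math.log2(count)

    def entropy(self):
        """Entropy in bits of the frequency distribution added so far."""
        if self.total == 0:
            return 0.0
        return math.log2(self.total) - self.c_log_c / self.total


bucket = RunningEntropy()
for freq in [5, 5, 5, 5]:   # four equally frequent values
    bucket.add(freq)
print(bucket.entropy())
```

Because only two running sums are kept, extending a bucket by one value does not require rescanning the values already in it, which is the kind of saving that makes a linear-time construction plausible.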
Exploring XML data is as easy as using maps
Yong Zeng, Z. Bao, Guoliang Li, T. Ling
Abstract: For keyword search on XML data, traditionally, a list of query results in the form of subtrees will be returned to users. However, we find that it is still not sufficient to meet users' information needs because: (1) the search intention of a certain keyword query varies from person to person; (2) the query results may have sibling or containment relationships among themselves (in the context of the whole XML database), which could be important for users to digest the query results and should be shown to users. Therefore, we try to equip the traditional XML keyword search engine with our new exploration model XMAP, providing users an interactive yet novel way to explore the results with better user experience.
DOI: https://doi.org/10.1145/2505515.2508201
Published: 2013-10-27
Citations: 0
Identifying salient entities in web pages
Michael Gamon, T. Yano, Xinying Song, Johnson Apacible, Patrick Pantel
Abstract: We propose a system that determines the salience of entities within web documents. Many recent advances in commercial search engines leverage the identification of entities in web pages. However, for many pages, only a small subset of entities are central to the document, which can lead to degraded relevance for entity triggered experiences. We address this problem by devising a system that scores each entity on a web page according to its centrality to the page content. We propose salience classification functions that incorporate various cues from document content, web search logs, and a large web graph. To cost-effectively train the models, we introduce a soft labeling methodology that generates a set of annotations based on user behaviors observed in web search logs. We evaluate several variations of our model via a large-scale empirical study conducted over a test set, which we release publicly to the research community. We demonstrate that our methods significantly outperform competitive baselines and the previous state of the art, while keeping the human annotation cost to a minimum.
DOI: https://doi.org/10.1145/2505515.2505602
Published: 2013-10-27
Citations: 34
QBEES: query by entity examples
S. Metzger, Ralf Schenkel, M. Sydow
Abstract: Structured knowledge bases are an increasingly important way for storing and retrieving information. Within such knowledge bases, an important search task is finding similar entities based on one or more example entities. We present QBEES, a novel framework for defining entity similarity based only on structural features, so-called aspects, of the entities, that includes query-dependent and query-independent entity ranking components. We present evaluation results with a number of existing entity list completion benchmarks, comparing to several state-of-the-art baselines.
DOI: https://doi.org/10.1145/2505515.2507873
Published: 2013-10-27
Citations: 26
Trustable aggregation of online ratings
Hyun-Kyo Oh, Sang-Wook Kim, Sunju Park, M. Zhou
Abstract: The average of the customer ratings on a product, which we call reputation, is one of the key factors in the online purchasing decision for a product. There is, however, no guarantee of the trustworthiness of the reputation since it can be manipulated rather easily. In this paper, we define false reputation as the problem of the reputation being manipulated by unfair ratings, and design a general framework that provides trustable reputation. For this purpose, we propose TRUEREPUTATION, an algorithm that iteratively adjusts the reputation based on the confidence of customer ratings.
DOI: https://doi.org/10.1145/2505515.2507863
Published: 2013-10-27
Citations: 5
Random walk-based graphical sampling in unbalanced heterogeneous bipartite social graphs
Yusheng Xie, Zhengzhang Chen, Ankit Agrawal, A. Choudhary, Lu Liu
Abstract: We investigate sampling techniques in unbalanced heterogeneous bipartite graphs (UHBGs), which have wide applications in real world web-scale social networks. We propose random walk-based link sampling and stratified sampling for UHBGs and show that they have advantages over generic random walk samplers. In addition, each sampler's node degree distribution parameter estimator statistic is analytically derived to be used as a quality indicator. In the experiments, we apply the two sampling techniques, with a baseline node sampling method, to both synthetic and real Facebook data. The experimental results show that the random walk-based stratified sampler has a significant advantage over the node sampler and the link sampler on UHBGs.
DOI: https://doi.org/10.1145/2505515.2507822
Published: 2013-10-27
Citations: 6
Nonparametric bayesian multitask collaborative filtering
S. Chatzis
Abstract: The dramatic rate at which new digital content becomes available has brought collaborative filtering systems to the epicenter of computer science research in the last decade. One of the greatest challenges collaborative filtering systems are confronted with is the data sparsity problem: users typically rate only very few items; thus, availability of historical data is not adequate to effectively perform prediction. To alleviate these issues, in this paper we propose a novel multitask collaborative filtering approach. Our approach is based on a coupled latent factor model of the users' rating functions, which allows for coming up with an agile information sharing mechanism that extracts much richer task-correlation information compared to existing approaches. Formulation of our method is based on concepts from the field of Bayesian nonparametrics, specifically Indian Buffet Process priors, which allow for data-driven determination of the optimal number of underlying latent features (item characteristics and user traits) assumed in the context of the model. We experiment on several real-world datasets, demonstrating both the efficacy of our method and its superiority over existing approaches.
DOI: https://doi.org/10.1145/2505515.2505517
Published: 2013-10-27
Citations: 28
Domain-dependent/independent topic switching model for online reviews with numerical ratings
Yasutoshi Ida, Takuma Nakamura, Takashi Matsumoto
Abstract: We propose a domain-dependent/independent topic switching model based on Bayesian probabilistic modeling for modeling online product reviews that are accompanied with numerical ratings provided by users. In this model, each word is allocated to a domain-dependent topic or a domain-independent topic, and the distribution of topics in an online review is connected to an observed numerical rating via a linear regression model. Domain-dependent topics utilize domain information observed with a corpus, and domain-independent topics utilize the framework of Bayesian Nonparametrics, which can estimate the number of topics in posterior distributions. The posterior distribution is estimated via collapsed Gibbs sampling. Using real data, our proposed model had smaller mean square error and smaller average mean error with a small model size and achieved convergence in fewer iterations for a regression task involving online review ratings, outperforming a baseline model that did not consider domains. Moreover, the proposed model can also tell us whether the words are positive or negative in the form of continuous values. This feature allows us to extract domain-dependent and -independent sentiment words.
DOI: https://doi.org/10.1145/2505515.2505540
Published: 2013-10-27
Citations: 7
Automated snippet generation for online advertising
Stamatina Thomaidou, Ismini Lourentzou, Panagiotis Katsivelis-Perakis, M. Vazirgiannis
Abstract: Products, services or brands can be advertised alongside the search results in major search engines, while recently smaller displays on devices like tablets and smartphones have imposed the need for smaller ad texts. In this paper, we propose a method that produces in an automated manner compact text ads (promotional text snippets), given as input a product description webpage (landing page). The challenge is to produce a small comprehensive ad while maintaining at the same time relevance, clarity, and attractiveness. Our method includes the following phases. Initially, it extracts relevant and important n-grams (keywords) given the landing page. The keywords reserved must have a positive meaning in order to have a call-to-action style, thus we attempt sentiment analysis on them. Next, we build an Advertising Language Model to evaluate phrases in terms of their marketing appeal. We experiment with two variations of our method and we show that they outperform all the baseline approaches.
DOI: https://doi.org/10.1145/2505515.2507876
Published: 2013-10-27
Citations: 25