Proceedings of the 19th ACM international conference on Information and knowledge management: latest articles

Automatic detection of craters in planetary images: an embedded framework using feature selection and boosting
W. Ding, T. Stepinski, L. Bandeira, R. Vilalta, Youxi Wu, Zhenyu Lu, Tianyu Cao
{"title":"Automatic detection of craters in planetary images: an embedded framework using feature selection and boosting","authors":"W. Ding, T. Stepinski, L. Bandeira, R. Vilalta, Youxi Wu, Zhenyu Lu, Tianyu Cao","doi":"10.1145/1871437.1871534","DOIUrl":"https://doi.org/10.1145/1871437.1871534","url":null,"abstract":"Identifying impact craters on planetary surfaces is a fundamental task in planetary science. In this paper, we present an embedded framework for the automatic detection of craters, using feature selection and boosting strategies. The paradigm aims at building a universal and practical crater detector. This methodology addresses three requirements that such a tool must meet: (i) it utilizes mathematical morphology to efficiently identify the regions of an image that can potentially contain craters; only those regions, defined as crater candidates, are the subjects of further processing; (ii) it selects Haar-like image texture features in combination with boosting ensemble supervised learning algorithms to accurately classify candidates into craters and non-craters; (iii) it uses transfer learning, at minimal additional cost, to maintain accurate auto-detection of craters on new images whose morphology differs from what was captured by the original training set. All three aforementioned components of the detection methodology are discussed, and the entire framework is evaluated on a large test image of 37,500 × 56,250 m² on Mars, showing heavily cratered Martian terrain characterized by nonuniform surface morphology. Our study demonstrates that this methodology provides a robust and practical tool for planetary science, in terms of both detection accuracy and efficiency.","PeriodicalId":310611,"journal":{"name":"Proceedings of the 19th ACM international conference on Information and knowledge management","volume":"90 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121446244","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 21
Accelerating probabilistic frequent itemset mining: a model-based approach
Liang Wang, Reynold Cheng, Sau-dan. Lee, D. Cheung
{"title":"Accelerating probabilistic frequent itemset mining: a model-based approach","authors":"Liang Wang, Reynold Cheng, Sau-dan. Lee, D. Cheung","doi":"10.1145/1871437.1871494","DOIUrl":"https://doi.org/10.1145/1871437.1871494","url":null,"abstract":"Data uncertainty is inherent in emerging applications such as location-based services, sensor monitoring systems, and data integration. To handle a large amount of imprecise information, uncertain databases have been recently developed. In this paper, we study how to efficiently discover frequent itemsets from large uncertain databases, interpreted under the Possible World Semantics. This is technically challenging, since an uncertain database induces an exponential number of possible worlds. To tackle this problem, we propose a novel method to capture the itemset mining process as a Poisson binomial distribution. This model-based approach extracts frequent itemsets with a high degree of accuracy, and supports large databases. We apply our techniques to improve the performance of the algorithms for: (1) finding itemsets whose frequentness probabilities are larger than some threshold; and (2) mining itemsets with the k highest frequentness probabilities. Our approaches support both tuple and attribute uncertainty models, which are commonly used to represent uncertain databases. Extensive evaluation on real and synthetic datasets shows that our methods are highly accurate. Moreover, they are orders of magnitude faster than previous approaches.","PeriodicalId":310611,"journal":{"name":"Proceedings of the 19th ACM international conference on Information and knowledge management","volume":"33 8","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113933722","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 83
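The abstract above models an itemset's support count as a Poisson binomial variable. A minimal sketch of that idea under the tuple-uncertainty model, assuming each tuple contains the itemset independently with a known probability: the first function computes the frequentness probability exactly by dynamic programming, the second uses a Poisson distribution with the same mean as a cheap approximation. The function names and the choice of a Poisson approximation are illustrative assumptions, not taken from the paper.

```python
import math


def exact_frequentness(probs, minsup):
    """P(support >= minsup) via the exact Poisson binomial pmf (O(n^2) DP)."""
    pmf = [1.0]  # pmf[k] = P(support == k) over the tuples processed so far
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)  # tuple does not contain the itemset
            nxt[k + 1] += mass * p      # tuple contains the itemset
        pmf = nxt
    return sum(pmf[minsup:])


def poisson_frequentness(probs, minsup):
    """Approximate P(support >= minsup) with a Poisson of mean sum(probs)."""
    mu = sum(probs)
    below = sum(math.exp(-mu) * mu ** k / math.factorial(k) for k in range(minsup))
    return 1.0 - below
```

The approximation needs only the sum of the per-tuple probabilities, which is why a model-based estimate can be much cheaper than enumerating the full distribution; its accuracy depends on the individual probabilities being small relative to the database size.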
Support elements in graph structured schema reintegration
Xun Sun, R. Pottinger, Michael K. Lawrence
{"title":"Support elements in graph structured schema reintegration","authors":"Xun Sun, R. Pottinger, Michael K. Lawrence","doi":"10.1145/1871437.1871621","DOIUrl":"https://doi.org/10.1145/1871437.1871621","url":null,"abstract":"Manipulating graph-structured schemas (ontologies, models, etc.) requires the result to remain fully connected. In certain cases, e.g., calculating the difference of two schemas, support structures may be needed in the result. We describe our engine to process support structures in the context of a schema management system and describe schema reintegration experiments which validate the performance and correctness of our system.","PeriodicalId":310611,"journal":{"name":"Proceedings of the 19th ACM international conference on Information and knowledge management","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130245678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Detecting product review spammers using rating behaviors
Ee-Peng Lim, Viet-An Nguyen, Nitin Jindal, B. Liu, Hady W. Lauw
{"title":"Detecting product review spammers using rating behaviors","authors":"Ee-Peng Lim, Viet-An Nguyen, Nitin Jindal, B. Liu, Hady W. Lauw","doi":"10.1145/1871437.1871557","DOIUrl":"https://doi.org/10.1145/1871437.1871557","url":null,"abstract":"This paper aims to detect users generating spam reviews, i.e., review spammers. We identify several characteristic behaviors of review spammers and model these behaviors so as to detect the spammers. In particular, we seek to model the following behaviors. First, spammers may target specific products or product groups in order to maximize their impact. Second, they tend to deviate from the other reviewers in their ratings of products. We propose scoring methods to measure the degree of spam for each reviewer and apply them to an Amazon review dataset. We then select a subset of highly suspicious reviewers for further scrutiny by our user evaluators with the help of a web based spammer evaluation software specially developed for user evaluation experiments. Our results show that our proposed ranking and supervised methods are effective in discovering spammers and outperform other baseline methods based on helpfulness votes alone. We finally show that the detected spammers have more significant impact on ratings compared with the unhelpful reviewers.","PeriodicalId":310611,"journal":{"name":"Proceedings of the 19th ACM international conference on Information and knowledge management","volume":"202 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131839679","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 802
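The second behavior named in the abstract above, deviating from other reviewers' ratings, can be illustrated with a very simple score. This is a hypothetical sketch, not the paper's scoring method: each reviewer's score is the mean absolute gap between their rating and the average rating given by the other reviewers of the same product. The function name and the input layout (one rating per reviewer per product) are assumptions.

```python
from collections import defaultdict


def deviation_scores(ratings):
    """ratings: iterable of (reviewer, product, rating) triples.

    Returns {reviewer: mean absolute deviation from the other reviewers'
    average rating}, computed over products with at least two ratings.
    """
    by_product = defaultdict(list)
    for reviewer, product, rating in ratings:
        by_product[product].append((reviewer, rating))

    totals = defaultdict(float)
    counts = defaultdict(int)
    for entries in by_product.values():
        if len(entries) < 2:
            continue  # no other reviewers to compare against
        total = sum(rating for _, rating in entries)
        for reviewer, rating in entries:
            others_mean = (total - rating) / (len(entries) - 1)
            totals[reviewer] += abs(rating - others_mean)
            counts[reviewer] += 1
    return {reviewer: totals[reviewer] / counts[reviewer] for reviewer in counts}
```

A high score only flags a reviewer for closer inspection; honest reviewers with unusual tastes also deviate, which is presumably why the abstract combines several behaviors and a human evaluation step.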
Index structures for efficiently searching natural language text
P. Chubak, Davood Rafiei
{"title":"Index structures for efficiently searching natural language text","authors":"P. Chubak, Davood Rafiei","doi":"10.1145/1871437.1871527","DOIUrl":"https://doi.org/10.1145/1871437.1871527","url":null,"abstract":"Many existing indexes on text work at the document granularity and are not effective in answering the class of queries where the desired answer is only a term or a phrase. In this paper, we study some of the index structures that are capable of answering the class of queries referred to here as wild card queries and perform an analysis of their performance. Our experimental results on a large class of queries from different sources (including query logs and parse trees) and with various datasets reveal some of the performance barriers of these indexes. We then present Word Permuterm Index (WPI) which is an adaptation of the permuterm index for natural language text applications and show that this index supports a wide range of wild card queries, is quick to construct and is highly scalable. Our experimental results comparing WPI to alternative methods on a wide range of wild card queries show a few orders of magnitude performance improvements for WPI while the memory usage is kept the same for all compared systems.","PeriodicalId":310611,"journal":{"name":"Proceedings of the 19th ACM international conference on Information and knowledge management","volume":"171 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129398045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
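The abstract above adapts the permuterm index to word sequences. A naive sketch of the classic permuterm idea at word granularity, assuming whitespace tokenization and queries with exactly one `*` that stands for any run of words: every rotation of each sentence (with an end marker) is stored in sorted order, and a query `X * Y` becomes a prefix lookup for `Y $ X`. This is an illustration of the general technique only; the names `build_index` and `search` are hypothetical, and the paper's WPI is likely far more compact than a sorted list of rotations.

```python
from bisect import bisect_left

END = "$"  # end-of-sentence marker, assumed not to occur as a word


def build_index(sentences):
    """Return a sorted list of (rotation, sentence_id) pairs."""
    rotations = []
    for sid, sentence in enumerate(sentences):
        tokens = sentence.lower().split() + [END]
        for i in range(len(tokens)):
            rotations.append((tuple(tokens[i:] + tokens[:i]), sid))
    rotations.sort()
    return rotations


def search(index, query):
    """Find sentences matching a query with exactly one '*' wildcard."""
    tokens = query.lower().split()
    star = tokens.index("*")
    # Rotate the query so the wildcard falls at the end: "X * Y" -> "Y $ X".
    prefix = tuple(tokens[star + 1:] + [END] + tokens[:star])
    start = bisect_left(index, (prefix,))
    hits = set()
    for rotation, sid in index[start:]:
        if rotation[:len(prefix)] != prefix:
            break  # rotations sharing the prefix are contiguous
        hits.add(sid)
    return sorted(hits)
```

For example, with the sentences "the quick brown fox", "the lazy dog", and "a quick brown dog", the query `the * fox` is answered by one binary search for the prefix `fox $ the` instead of a scan over all sentences.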
Online update of b-trees
Marina Barsky, Alex Thomo, Zoltán Tóth, C. Zuzarte
{"title":"Online update of b-trees","authors":"Marina Barsky, Alex Thomo, Zoltán Tóth, C. Zuzarte","doi":"10.1145/1871437.1871460","DOIUrl":"https://doi.org/10.1145/1871437.1871460","url":null,"abstract":"Many scenarios impose a heavy update load on B-tree indexes in modern databases. A typical case is when B-trees are used for indexing all the keywords of a text field. For example, upon the insertion of a new text record (e.g. a new document arrives), a barrage of new keywords has to be inserted into the index causing many random disk I/Os and interrupting the normal operation of the database. The common approach has been to collect the updates in a separate structure and then perform a batch update of the index. This update \"freezes\" the database. Many applications, however, require the immediate availability of the new updates without any interruption of the normal database operation. In this paper we present a novel online B-tree update method based on a new buffering data structure we introduce - Dynamic Bucket Tree (DBT). The DBT-buffer serves as a differential index for new updates. The grouping of keys in DBT-buffer is based on the longest common prefixes (LCP) of their binary representations. The LCP is used as a measure of the locality of keys to be transferred to the main B-tree. Our online update system does not slow down concurrent user transactions or lead to degradation of search performance. Experiments confirm that our DBT buffer can be efficiently used for online updates of text fields. As such it represents an effective solution to the notorious problem of handling updates to an Inverted Index.","PeriodicalId":310611,"journal":{"name":"Proceedings of the 19th ACM international conference on Information and knowledge management","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131064529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
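The abstract above groups buffered keys by the longest common prefix (LCP) of their binary representations. A small sketch of that measure, assuming keys are non-negative integers that fit in a fixed bit width: two keys sharing a long bit prefix are numerically close, so they are likely to land in the same region of the main B-tree. The helper names and the flat dictionary of buckets are illustrative assumptions; this is not the Dynamic Bucket Tree itself.

```python
def lcp_bits(a, b, width=32):
    """Number of leading bits shared by a and b in a width-bit representation."""
    # XOR clears every shared bit; the highest set bit marks the first mismatch.
    return width - (a ^ b).bit_length()


def bucket_by_prefix(keys, prefix_bits, width=32):
    """Group keys that share their first prefix_bits bits."""
    shift = width - prefix_bits
    buckets = {}
    for key in keys:
        buckets.setdefault(key >> shift, []).append(key)
    return buckets
```

Flushing one bucket at a time would turn many scattered insertions into a few localized ones, which is the kind of locality the abstract attributes to the LCP measure.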
Using various term dependencies according to their utilities
Lixin Shi, Jian-Yun Nie
{"title":"Using various term dependencies according to their utilities","authors":"Lixin Shi, Jian-Yun Nie","doi":"10.1145/1871437.1871655","DOIUrl":"https://doi.org/10.1145/1871437.1871655","url":null,"abstract":"In this paper, we propose a model to integrate term dependencies. Different from previous studies, each pair of terms is assigned a different weight of dependency according to their utility to IR. The experiments show that our model can significantly outperform the previous dependency models using fixed weights.","PeriodicalId":310611,"journal":{"name":"Proceedings of the 19th ACM international conference on Information and knowledge management","volume":"88 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131220174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 20
Exploring and visualizing academic social networks
V. Ganev, Zhaochen Guo, Diego Serrano, Denilson Barbosa, Eleni Stroulia
{"title":"Exploring and visualizing academic social networks","authors":"V. Ganev, Zhaochen Guo, Diego Serrano, Denilson Barbosa, Eleni Stroulia","doi":"10.1145/1871437.1871786","DOIUrl":"https://doi.org/10.1145/1871437.1871786","url":null,"abstract":"We demonstrate the ReaSoN portal, consisting of interactive web-based tools for visualizing, exploring, querying, and integrating academic social networks. We describe how these networks are automatically extracted from bibliographic and citation databases, discuss notions of visibility in such networks which enable a rich set of social network analyses, and demonstrate our novel tools for the visualization and exploration of social networks.","PeriodicalId":310611,"journal":{"name":"Proceedings of the 19th ACM international conference on Information and knowledge management","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132839912","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Selecting keywords for content based recommendation
Christian Wartena, Wout Slakhorst, M. Wibbels
{"title":"Selecting keywords for content based recommendation","authors":"Christian Wartena, Wout Slakhorst, M. Wibbels","doi":"10.1145/1871437.1871665","DOIUrl":"https://doi.org/10.1145/1871437.1871665","url":null,"abstract":"The continued growth of online content makes personalized recommendation an increasingly important tool for media consumption. While collaborative filtering techniques have been shown to be very successful in stable collections, content based approaches are necessary for recommending new items. Content based recommendation uses the similarity between new items and consumed items to predict whether a new item is interesting for the user. The similarity is computed by comparing the content or the meta-data of the items. In this paper we consider recommendation of TV-broadcasts for which meta-data and synopses are available. We thereby concentrate on the new item problem. We investigate the value of different types of meta-data provided by the broadcaster or extracted from synopsis. We show that extracted keywords are better suited for recommendation than manually assigned keywords. Furthermore, we show that the number of keywords used is of great importance. Using a rather small number of keywords to present an item yields the best results for recommendation.","PeriodicalId":310611,"journal":{"name":"Proceedings of the 19th ACM international conference on Information and knowledge management","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132871770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 19
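The abstract above predicts interest in a new item from its keyword similarity to already consumed items, and reports that a small number of keywords per item works best. A minimal sketch of that setup, assuming each item is a dictionary of keyword weights (for instance from a keyword extractor) and that similarity is cosine similarity averaged over the consumed items. The function names, the weighting, and the averaging rule are assumptions made for illustration, not the paper's method.

```python
import math


def top_keywords(weights, n):
    """Keep the n highest-weighted keywords (ties broken alphabetically)."""
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return dict(ranked[:n])


def cosine(a, b):
    """Cosine similarity between two sparse keyword-weight vectors."""
    dot = sum(w * b[k] for k, w in a.items() if k in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def interest_score(new_item, consumed_items, n):
    """Average similarity of a new item to the user's consumed items."""
    new_vec = top_keywords(new_item, n)
    sims = [cosine(new_vec, top_keywords(item, n)) for item in consumed_items]
    return sum(sims) / len(sims) if sims else 0.0
```

The parameter `n` is the knob the abstract refers to: truncating each item to its few strongest keywords removes weak, noisy terms before the comparison.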
A topical link model for community discovery in textual interaction graph
Guoqing Zheng, Jinwen Guo, Lichun Yang, Shengliang Xu, Shenghua Bao, Zhong Su, Dingyi Han, Yong Yu
{"title":"A topical link model for community discovery in textual interaction graph","authors":"Guoqing Zheng, Jinwen Guo, Lichun Yang, Shengliang Xu, Shenghua Bao, Zhong Su, Dingyi Han, Yong Yu","doi":"10.1145/1871437.1871686","DOIUrl":"https://doi.org/10.1145/1871437.1871686","url":null,"abstract":"This paper is concerned with community discovery in textual interaction graphs, where the links between entities are indicated by textual documents. Specifically, we propose a Topical Link Model (TLM), which leverages the Hierarchical Dirichlet Process (HDP) to introduce a hidden topical variable for the links. Beyond using the links themselves, TLM can look into the documents on the links in detail to recover sound communities. Moreover, TLM is a nonparametric model, which is able to learn the number of communities from the data. Extensive experiments on two real world corpora show that TLM outperforms two state-of-the-art baseline models, which verifies the effectiveness of TLM in determining the proper number of communities and generating sound communities.","PeriodicalId":310611,"journal":{"name":"Proceedings of the 19th ACM international conference on Information and knowledge management","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132931603","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4