{"title":"Opinion classification with tree kernel SVM using linguistic modality analysis","authors":"Takeshi S. Kobayakawa, T. Kumano, Hideki Tanaka, Naoaki Okazaki, Jin-Dong Kim, Junichi Tsujii","doi":"10.1145/1645953.1646231","DOIUrl":"https://doi.org/10.1145/1645953.1646231","url":null,"abstract":"We propose a method for classifying opinions that captures the role of linguistic modalities in the sentence. We use richer features than simple bag-of-words or opinion-holding predicates. The method is based on machine learning and utilizes opinion-holding predicates and linguistic modalities as features. Two different detectors help to classify the opinions: the opinion-holding predicate detector and the modality detector. A target opinion is first parsed into a dependency structure, and then the opinion-holding predicates and modalities are attached to the leaf nodes of the dependency tree. The whole tree is regarded as the input features of the opinion and becomes the input of tree kernel support vector machines. We have applied the method to Japanese opinions about television programs, and have confirmed its effectiveness against conventional bag-of-words features and simple opinion-holding predicate features.","PeriodicalId":286251,"journal":{"name":"Proceedings of the 18th ACM conference on Information and knowledge management","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122387713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The influence of the document ranking in expert search","authors":"C. Macdonald, I. Ounis","doi":"10.1145/1645953.1646282","DOIUrl":"https://doi.org/10.1145/1645953.1646282","url":null,"abstract":"The retrieval effectiveness of the underlying document search component of an expert search engine can have an important impact on the effectiveness of the generated expert search results. In this large-scale study, we perform novel experiments in the context of the document search and expert search tasks of the TREC Enterprise track, to measure the influence that the performance of the document ranking has on the ranking of candidate experts. In particular, we show, using real and simulated document rankings, that while the expert search system performance is related to the relevance of the retrieved documents, surprisingly, it is not always the case that increasing document search effectiveness causes an increase in expert search performance.","PeriodicalId":286251,"journal":{"name":"Proceedings of the 18th ACM conference on Information and knowledge management","volume":"102 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122547529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient processing of group-oriented connection queries in a large graph","authors":"James Cheng, Yiping Ke, Wilfred Ng","doi":"10.1145/1645953.1646151","DOIUrl":"https://doi.org/10.1145/1645953.1646151","url":null,"abstract":"We study query processing in large graphs, which are a fundamental data model underpinning various social networks and Web structures. Given a set of query nodes, we aim to find the groups to which the query nodes belong, as well as the best connection among the groups. Such a query is useful in many applications, but processing it is extremely costly. We define a new notion of Correlation Group (CG), which is a set of nodes that are strongly correlated in a large graph G. We then extract the subgraph from G that gives the best connection for the nodes in a CG. To facilitate query processing, we develop an efficient index built upon the CGs. Our experiments show that the CGs are meaningful as groups and, importantly, that the meaningfulness of the query results is justifiable. We also demonstrate the high efficiency of CG computation, index construction and query processing.","PeriodicalId":286251,"journal":{"name":"Proceedings of the 18th ACM conference on Information and knowledge management","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122746409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Easiest-first search: towards comprehension-based web search","authors":"Makoto Nakatani, A. Jatowt, Katsumi Tanaka","doi":"10.1145/1645953.1646300","DOIUrl":"https://doi.org/10.1145/1645953.1646300","url":null,"abstract":"Although Web search engines have become information gateways to the Internet, for queries containing technical terms, search results often contain pages that are difficult for non-expert users to understand. Therefore, re-ranking search results in descending order of comprehensibility should be effective for non-expert users. In our approach, the comprehensibility of Web pages is estimated by considering both document readability and the difficulty of technical terms in the domain of search queries. To extract technical terms, we exploit domain knowledge extracted from Wikipedia. Our proposed method can be applied to general Web search engines, as Wikipedia covers nearly every field of human knowledge. We demonstrate the usefulness of our approach through user experiments.","PeriodicalId":286251,"journal":{"name":"Proceedings of the 18th ACM conference on Information and knowledge management","volume":"527 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122828641","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Topic and keyword re-ranking for LDA-based topic modeling","authors":"Yangqiu Song, Shimei Pan, Shixia Liu, Michelle X. Zhou, Weihong Qian","doi":"10.1145/1645953.1646223","DOIUrl":"https://doi.org/10.1145/1645953.1646223","url":null,"abstract":"Topic-based text summaries promise to help average users quickly understand a text collection and derive insights. Recent research has shown that the Latent Dirichlet Allocation (LDA) model is one of the most effective approaches to topic analysis. However, the LDA-based results may not be ideal for human understanding and consumption. In this paper, we present several topic and keyword re-ranking approaches that can help users better understand and consume the LDA-derived topics in their text analysis. Our methods process the LDA output based on a set of criteria that model a user's information needs. Our evaluation demonstrates the usefulness of the methods in summarizing several large-scale, real world data sets.","PeriodicalId":286251,"journal":{"name":"Proceedings of the 18th ACM conference on Information and knowledge management","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122874000","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analyzing and evaluating query reformulation strategies in web search logs","authors":"Jeff Huang, E. Efthimiadis","doi":"10.1145/1645953.1645966","DOIUrl":"https://doi.org/10.1145/1645953.1645966","url":null,"abstract":"Users frequently modify a previous search query in hope of retrieving better results. These modifications are called query reformulations or query refinements. Existing research has studied how web search engines can propose reformulations, but has given less attention to how people perform query reformulations. In this paper, we aim to better understand how web searchers refine queries and form a theoretical foundation for query reformulation. We study users' reformulation strategies in the context of the AOL query logs. We create a taxonomy of query refinement strategies and build a high precision rule-based classifier to detect each type of reformulation. Effectiveness of reformulations is measured using user click behavior. Most reformulation strategies result in some benefit to the user. Certain strategies like add/remove words, word substitution, acronym expansion, and spelling correction are more likely to cause clicks, especially on higher ranked results. In contrast, users often click the same result as their previous query or select no results when forming acronyms and reordering words. Perhaps the most surprising finding is that some reformulations are better suited to helping users when the current results are already fruitful, while other reformulations are more effective when the results are lacking. Our findings inform the design of applications that can assist searchers; examples are described in this paper.","PeriodicalId":286251,"journal":{"name":"Proceedings of the 18th ACM conference on Information and knowledge management","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121638761","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Subspace maximum margin clustering","authors":"Quanquan Gu, Jie Zhou","doi":"10.1145/1645953.1646122","DOIUrl":"https://doi.org/10.1145/1645953.1646122","url":null,"abstract":"In text mining, we are often confronted with very high dimensional data. Clustering high dimensional data is a challenging problem due to the curse of dimensionality. In this paper, to address this problem, we propose a subspace maximum margin clustering (SMMC) method, which performs dimensionality reduction and maximum margin clustering simultaneously within a unified framework. We aim to learn a subspace in which we try to find a cluster assignment of the data points, together with a hyperplane classifier, such that the resultant margin is maximized over all possible cluster assignments and all possible subspaces. The original problem is transformed from learning the subspace to learning a positive semi-definite matrix, in order to avoid tuning the dimensionality of the subspace. The transformed problem can be solved efficiently via the cutting plane technique and the constrained concave-convex procedure (CCCP). Since the sub-problem in each iteration of CCCP is jointly convex, alternating minimization is adopted to obtain the global optimum. Experiments on benchmark data sets illustrate that the proposed method outperforms state-of-the-art clustering methods as well as many dimensionality reduction based clustering approaches.","PeriodicalId":286251,"journal":{"name":"Proceedings of the 18th ACM conference on Information and knowledge management","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133184421","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A collaborative filtering approach to ad recommendation using the query-ad click graph","authors":"T. Anastasakos, D. Hillard, Sanjay Kshetramade, Hema Raghavan","doi":"10.1145/1645953.1646267","DOIUrl":"https://doi.org/10.1145/1645953.1646267","url":null,"abstract":"Search engine logs contain a large amount of click-through data that can be leveraged as soft indicators of relevance. In this paper we address the sponsored search retrieval problem, which is to find and rank ads relevant to a search query. We propose a new technique to determine the relevance of an ad document for a search query using click-through data. The method builds on a collaborative filtering approach to discover new ads related to a query using a click graph. It is implemented on a graph with several million edges and scales easily to larger sizes. The proposed method is compared to three different baselines that are state-of-the-art for a commercial search engine. Evaluations on editorial data indicate that the model discovers many new ads not retrieved by the baseline methods. The ads from the new approach are on average of better quality than the baselines.","PeriodicalId":286251,"journal":{"name":"Proceedings of the 18th ACM conference on Information and knowledge management","volume":"151 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133233636","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploiting internal and external semantics for the clustering of short texts using world knowledge","authors":"Xia Hu, Nan Sun, Chao Zhang, Tat-Seng Chua","doi":"10.1145/1645953.1646071","DOIUrl":"https://doi.org/10.1145/1645953.1646071","url":null,"abstract":"Clustering of short texts, such as snippets, presents great challenges to existing aggregated search techniques due to the problem of data sparseness and the complex semantics of natural language. As short texts do not provide sufficient term occurrence information, traditional text representation methods, such as the \"bag of words\" model, have several limitations when directly applied to short text tasks. In this paper, we propose a novel framework to improve the performance of short text clustering by exploiting the internal semantics of the original text and external concepts from world knowledge. The proposed method employs a hierarchical three-level structure to tackle the data sparsity problem of original short texts and reconstructs the corresponding feature space with the integration of multiple semantic knowledge bases -- Wikipedia and WordNet. Empirical evaluation with Reuters and a real web dataset demonstrates that our approach is able to achieve significant improvement as compared to the state-of-the-art methods.","PeriodicalId":286251,"journal":{"name":"Proceedings of the 18th ACM conference on Information and knowledge management","volume":"170 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133937595","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Similarity-aware indexing for real-time entity resolution","authors":"P. Christen, Ross W. Gayler, D. Hawking","doi":"10.1145/1645953.1646173","DOIUrl":"https://doi.org/10.1145/1645953.1646173","url":null,"abstract":"Entity resolution, also known as data matching or record linkage, is the task of identifying and matching records from several databases that refer to the same entities. Traditionally, entity resolution has been applied in batch-mode and on static databases. However, many organisations are increasingly faced with the challenge of having large databases containing entities that need to be matched in real-time with a stream of query records also containing entities, such that the best matching records are retrieved. Example applications include online law enforcement and national security databases, public health surveillance and emergency response systems, financial verification systems, online retail stores, eGovernment services, and digital libraries. A novel inverted index based approach for real-time entity resolution is presented in this paper. At build time, similarities between attribute values are computed and stored to support the fast matching of records at query time. The presented approach differs from other approaches to approximate query matching in that it allows any similarity comparison function, and any 'blocking' (encoding) function, both possibly domain specific, to be incorporated. Experimental results on a real-world database indicate that the total size of all data structures of this novel index approach grows sub-linearly with the size of the database, and that it allows matching of query records in sub-second time, more than two orders of magnitude faster than a traditional entity resolution index approach. The interested reader is referred to the longer version of this paper [5].","PeriodicalId":286251,"journal":{"name":"Proceedings of the 18th ACM conference on Information and knowledge management","volume":"353 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134228425","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}