{"title":"Graph-based text classification: learn from your neighbors","authors":"Ralitsa Angelova, G. Weikum","doi":"10.1145/1148170.1148254","DOIUrl":"https://doi.org/10.1145/1148170.1148254","url":null,"abstract":"Automatic classification of data items, based on training samples, can be boosted by considering the neighborhood of data items in a graph structure (e.g., neighboring documents in a hyperlink environment or co-authors and their publications for bibliographic data entries). This paper presents a new method for graph-based classification, with particular emphasis on hyperlinked text documents but broader applicability. Our approach is based on iterative relaxation labeling and can be combined with either Bayesian or SVM classifiers on the feature spaces of the given data items. The graph neighborhood is taken into consideration to exploit locality patterns while at the same time avoiding overfitting. In contrast to prior work along these lines, our approach employs a number of novel techniques: dynamically inferring the link/class pattern in the graph in the run of the iterative relaxation labeling, judicious pruning of edges from the neighborhood graph based on node dissimilarities and node degrees, weighting the influence of edges based on a distance metric between the classification labels of interest and weighting edges by content similarity measures. Our techniques considerably improve the robustness and accuracy of the classification outcome, as shown in systematic experimental comparisons with previously published methods on three different real-world datasets.","PeriodicalId":433366,"journal":{"name":"Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval","volume":"181 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131883780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Load balancing for term-distributed parallel retrieval","authors":"Alistair Moffat, William Webber, J. Zobel","doi":"10.1145/1148170.1148232","DOIUrl":"https://doi.org/10.1145/1148170.1148232","url":null,"abstract":"Large-scale web and text retrieval systems deal with amounts of data that greatly exceed the capacity of any single machine. To handle the necessary data volumes and query throughput rates, parallel systems are used, in which the document and index data are split across tightly-clustered distributed computing systems. The index data can be distributed either by document or by term. In this paper we examine methods for load balancing in term-distributed parallel architectures, and propose a suite of techniques for reducing net querying costs. In combination, the techniques we describe allow a 30% improvement in query throughput when tested on an eight-node parallel computer system.","PeriodicalId":433366,"journal":{"name":"Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132239434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning to advertise","authors":"A. Lacerda, Marco Cristo, Marcos André Gonçalves, Weiguo Fan, N. Ziviani, B. Ribeiro-Neto","doi":"10.1145/1148170.1148265","DOIUrl":"https://doi.org/10.1145/1148170.1148265","url":null,"abstract":"Content-targeted advertising, the task of automatically associating ads to a Web page, constitutes a key Web monetization strategy nowadays. Further, it introduces new challenging technical problems and raises interesting questions. For instance, how to design ranking functions able to satisfy conflicting goals such as selecting advertisements (ads) that are relevant to the users and suitable and profitable to the publishers and advertisers? In this paper we propose a new framework for associating ads with web pages based on Genetic Programming (GP). Our GP method aims at learning functions that select the most appropriate ads, given the contents of a Web page. These ranking functions are designed to optimize overall precision and minimize the number of misplacements. By using a real ad collection and web pages from a newspaper, we obtained a gain over a state-of-the-art baseline method of 61.7% in average precision. Further, by evolving individuals to provide good ranking estimations, GP was able to discover ranking functions that are very effective in placing ads in web pages while avoiding irrelevant ones.","PeriodicalId":433366,"journal":{"name":"Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval","volume":"223 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134455952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A method of rating the credibility of news documents on the web","authors":"Ryosuke Nagura, Yohei Seki, N. Kando, Masaki Aono","doi":"10.1145/1148170.1148316","DOIUrl":"https://doi.org/10.1145/1148170.1148316","url":null,"abstract":"We propose a method to rate the credibility of news articles using three clues: (1) commonality of the contents of articles among different news publishers; (2) numerical agreement versus contradiction of numerical values reported in the articles; and (3) objectivity based on subjective speculative phrases and news sources. We tested this method on news stories taken from seven different news sites on the Web. The average agreement between the system-produced \"credibility\" and the manual judgments of three human assessors on the 52 sample articles was 69.1%. The limitations of the current approach and future directions are discussed.","PeriodicalId":433366,"journal":{"name":"Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115031089","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"One-sided measures for evaluating ranked retrieval effectiveness with spontaneous conversational speech","authors":"Baolong Liu, Douglas W. Oard","doi":"10.1145/1148170.1148311","DOIUrl":"https://doi.org/10.1145/1148170.1148311","url":null,"abstract":"Early speech retrieval experiments focused on news broadcasts, for which adequate Automatic Speech Recognition (ASR) accuracy could be obtained. Like newspapers, news broadcasts are a manually selected and arranged set of stories. Evaluation designs reflected that, using known story boundaries as a basis for evaluation. Substantial advances in ASR accuracy now make it possible to build search systems for some types of spontaneous conversational speech, but present evaluation designs continue to rely on known topic boundaries that are no longer well matched to the nature of the materials. We propose a new class of measures for speech retrieval based on manual annotation of points at which a user with specific topical interests would wish replay to begin.","PeriodicalId":433366,"journal":{"name":"Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115513761","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unity: relevance feedback using user query logs","authors":"J. Parikh, S. Kapur","doi":"10.1145/1148170.1148319","DOIUrl":"https://doi.org/10.1145/1148170.1148319","url":null,"abstract":"The exponential growth of the Web and the increasing ability of web search engines to index data have led to a problem of plenty. The number of results returned per query is typically in the order of millions of documents for many common queries. Although there is the benefit of added coverage for every query, the problem of ranking these documents and giving the best results gets worse. The problem is even more difficult in case of temporal and ambiguous queries. We try to address this problem using feedback from user query logs. We leverage a technology called Units for generating query refinements which are shown as Also try queries on Yahoo! Search. We consider these refinements as sub-concepts which help define user intent and use them to improve search relevance. The results obtained via live testing on Yahoo! Search are encouraging.","PeriodicalId":433366,"journal":{"name":"Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124490481","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Concept-based biomedical text retrieval","authors":"Ming Zhong, Xiangji Huang","doi":"10.1145/1148170.1148336","DOIUrl":"https://doi.org/10.1145/1148170.1148336","url":null,"abstract":"One challenging problem for biomedical text retrieval is to find accurate synonyms or name variants for biomedical entities. In this paper, we propose a new concept-based approach to tackle this problem. In this approach, a set of concepts instead of keywords will be extracted from a query first. Then these concepts will be used for retrieval purpose. The experiment results show that the proposed approach can boost the retrieval performance and it generates very good results on 2005 TREC Genomics data sets.","PeriodicalId":433366,"journal":{"name":"Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123930759","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An analysis of the coupling between training set and neighborhood sizes for the kNN classifier","authors":"J. S. Olsson","doi":"10.1145/1148170.1148317","DOIUrl":"https://doi.org/10.1145/1148170.1148317","url":null,"abstract":"We consider the relationship between training set size and the parameter k for the k-Nearest Neighbors (kNN) classifier. When few examples are available, we observe that accuracy is sensitive to k and that best k tends to increase with training size. We explore the subsequent risk that k tuned on partitions will be suboptimal after aggregation and re-training. This risk is found to be most severe when little data is available. For larger training sizes, accuracy becomes increasingly stable with respect to k and the risk decreases.","PeriodicalId":433366,"journal":{"name":"Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125112575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Information retrieval at Boeing: plans and successes","authors":"R. Radhakrishnan","doi":"10.1145/1148170.1148173","DOIUrl":"https://doi.org/10.1145/1148170.1148173","url":null,"abstract":"Background Many information technology products in the marketplace are designed as “Enterprise” solutions and systems. Boeing is an enterprise composed of multiple enterprises: Boeing Commercial Airplanes, Integrated Defense Systems, Boeing Capital Corporation and Connexion by Boeing. In the necessary globalization of the 21 Century, each of these major business units are really extended enterprises due to the partnerships and business arrangements we have made, involving worldwide engineering design and manufacturing companies, suppliers and subcontractors, air carriers and leasing operators, and military & government agencies. It is difficult to scale many IT design approaches and products to the Boeing operating environment.","PeriodicalId":433366,"journal":{"name":"Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval","volume":"74 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126275028","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evaluating evaluation metrics based on the bootstrap","authors":"T. Sakai","doi":"10.1145/1148170.1148261","DOIUrl":"https://doi.org/10.1145/1148170.1148261","url":null,"abstract":"This paper describes how the Bootstrap approach to statistics can be applied to the evaluation of IR effectiveness metrics. First, we argue that Bootstrap Hypothesis Tests deserve more attention from the IR community, as they are based on fewer assumptions than traditional statistical significance tests. We then describe straightforward methods for comparing the sensitivity of IR metrics based on Bootstrap Hypothesis Tests. Unlike the heuristics-based \"swap\" method proposed by Voorhees and Buckley, our method estimates the performance difference required to achieve a given significance level directly from Bootstrap Hypothesis Test results. In addition, we describe a simple way of examining the accuracy of rank correlation between two metrics based on the Bootstrap Estimate of Standard Error. We demonstrate the usefulness of our methods using test collections and runs from the NTCIR CLIR track for comparing seven IR metrics, including those that can handle graded relevance and those based on the Geometric Mean.","PeriodicalId":433366,"journal":{"name":"Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-08-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130230063","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}