Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval: latest papers

Combining document representations for known-item search

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval Pub Date: 2003-07-28 DOI: 10.1145/860435.860463

Paul Ogilvie, Jamie Callan

Citations: 270

Speech-based and video-supported indexing of multimedia broadcast news

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval Pub Date: 2003-07-28 DOI: 10.1145/860435.860541

Yoshihiko Hayashi, K. Ohtsuki, K. Bessho, Osamu Mizuno, Y. Matsuo, S. Matsunaga, Minoru Hayashi, T. Hasegawa, Naruhiro Ikeda

Citations: 21

A maximal figure-of-merit learning approach to text categorization

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval Pub Date: 2003-07-28 DOI: 10.1145/860435.860469

Sheng Gao, Wen-Chin Wu, Chin-Hui Lee, Tat-Seng Chua

Citations: 52

Keynote Address - exploring, modeling, and using the web graph

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval Pub Date: 2003-07-28 DOI: 10.1145/860435.860436

A. Broder

Abstract: The Web graph, meaning the graph induced by Web pages as nodes and their hyperlinks as directed edges, has become a fascinating object of study for many people: physicists, sociologists, mathematicians, computer scientists, and information retrieval specialists. Recent results range from theoretical (e.g., models for the graph, semi-external algorithms), to experimental (e.g., new insights regarding the rate of change of pages, new data on the distribution of degrees), to practical (e.g., improvements in crawling technology). The goal of this talk is to convey an introduction to the state of the art in this area and to sketch the current issues in collecting, representing, analyzing, and modeling this graph. Although graph analytic methods are essential tools in the Web IR arsenal, they are well known to the SIGIR community and will not be discussed here in any detail; instead, we will explore some challenges and opportunities for using IR methods and techniques in the exploration of the Web graph, in particular in dealing with legitimate and "spam" perturbations of the "natural" process of birth and death of nodes and links, and conversely, the challenges and opportunities of using graph methods in support of IR on the Web and in the enterprise.

Citations: 3

Robustness of regularized linear classification methods in text categorization

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval Pub Date: 2003-07-28 DOI: 10.1145/860435.860471

Jian Zhang, Yiming Yang

Citations: 91

Image classification using hybrid neural networks

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval Pub Date: 2003-07-28 DOI: 10.1145/860435.860536

Chih-Fong Tsai, K. McGarry, J. Tait

Citations: 45

An empirical study on retrieval models for different document genres: patents and newspaper articles

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval Pub Date: 2003-07-28 DOI: 10.1145/860435.860482

Makoto Iwayama, Atsushi Fujii, N. Kando, Yuzo Marukawa

Citations: 54

SE-LEGO: creating metasearch engines on demand

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval Pub Date: 2003-07-28 DOI: 10.1145/860435.860555

Zonghuan Wu, Vijay V. Raghavan, Chun Du, C. KomanduruSai, W. Meng, Hai He, Clement T. Yu

Extended abstract: As a system that provides unified access to multiple existing search systems, a metasearch engine can relieve ordinary users of the formidable task of identifying useful sources and searching them individually. At present, the largest metasearch engines such as ProFusion (www.profusion.com) and SavvySearch (www.search.com) can connect to about 1,000 search engines. This means that only a small fraction of the information sources on the Web, including both the Surface Web and the Deep Web, are connected, as the number of such sources is estimated to be on the order of hundreds of thousands [1]. Most of these Websites have their own search capabilities and provide search interfaces. Many of these Websites provide high quality information that has been frequently queried by specialists and researchers in particular fields. Present major metasearch engines usually do not connect to these specialized Websites. Currently, building a metasearch engine is an expensive and labor-intensive job that needs diverse expertise. As a result, it is difficult for an ordinary Web user to create a metasearch engine based on the search engines of the user’s choice. Some metasearch engine companies (e.g., ProFusion) allow users to build customized metasearch engines, but only search engines in a pre-compiled list can be used because the capability to connect to these search engines needs to be established in advance.

Citations: 8

Beyond independent relevance: methods and evaluation metrics for subtopic retrieval

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval Pub Date: 2003-07-28 DOI: 10.1145/860435.860440

ChengXiang Zhai, William W. Cohen, J. Lafferty

Abstract: We present a non-traditional retrieval problem we call subtopic retrieval. The subtopic retrieval problem is concerned with finding documents that cover many different subtopics of a query topic. In such a problem, the utility of a document in a ranking is dependent on other documents in the ranking, violating the assumption of independent relevance which is assumed in most traditional retrieval methods. Subtopic retrieval poses challenges for evaluating performance, as well as for developing effective algorithms. We propose a framework for evaluating subtopic retrieval which generalizes the traditional precision and recall metrics by accounting for intrinsic topic difficulty as well as redundancy in documents. We propose and systematically evaluate several methods for performing subtopic retrieval using statistical language models and a maximal marginal relevance (MMR) ranking strategy. A mixture model combined with query likelihood relevance ranking is shown to modestly outperform a baseline relevance ranking on a data set used in the TREC interactive track.

Citations: 512

Re-examining the potential effectiveness of interactive query expansion

Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval Pub Date: 2003-07-28 DOI: 10.1145/860435.860475

I. Ruthven

Citations: 222