Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval最新文献

筛选
英文 中文
Session details: Applications II 会话细节:应用程序
David D. Lewis
{"title":"Session details: Applications II","authors":"David D. Lewis","doi":"10.1145/3254389","DOIUrl":"https://doi.org/10.1145/3254389","url":null,"abstract":"","PeriodicalId":378368,"journal":{"name":"Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123927447","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Query recovery of short user queries: on query expansion with stopwords 用户短查询的查询恢复:对带有停止词的查询扩展
Johannes Leveling, G. Jones
{"title":"Query recovery of short user queries: on query expansion with stopwords","authors":"Johannes Leveling, G. Jones","doi":"10.1145/1835449.1835589","DOIUrl":"https://doi.org/10.1145/1835449.1835589","url":null,"abstract":"User queries to search engines are observed to predominantly contain inflected content words but lack stopwords and capitalization. Thus, they often resemble natural language queries after case folding and stopword removal. Query recovery aims to generate a linguistically well-formed query from a given user query as input to provide natural language processing tasks and cross-language information retrieval (CLIR). The evaluation of query translation shows that translation scores (NIST and BLEU) decrease after case folding, stopword removal, and stemming. A baseline method for query recovery reconstructs capitalization and stopwords, which considerably increases translation scores and significantly increases mean average precision for a standard CLIR task.","PeriodicalId":378368,"journal":{"name":"Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval","volume":"54 3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116333452","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
Proximity-based opinion retrieval 基于接近度的意见检索
Shima Gerani, Mark James Carman, F. Crestani
{"title":"Proximity-based opinion retrieval","authors":"Shima Gerani, Mark James Carman, F. Crestani","doi":"10.1145/1835449.1835517","DOIUrl":"https://doi.org/10.1145/1835449.1835517","url":null,"abstract":"Blog post opinion retrieval aims at finding blog posts that are relevant and opinionated about a user's query. In this paper we propose a simple probabilistic model for assigning relevant opinion scores to documents. The key problem is how to capture opinion expressions in the document, that are related to the query topic. Current solutions enrich general opinion lexicons by finding query-specific opinion lexicons using pseudo-relevance feedback on external corpora or the collection itself. In this paper we use a general opinion lexicon and propose using proximity information in order to capture opinion term relatedness to the query. We propose a proximity-based opinion propagation method to calculate the opinion density at each point in a document. The opinion density at the position of a query term in the document can then be considered as the probability of opinion about the query term at that position. The effect of different kernels for capturing the proximity is also discussed. Experimental results on the BLOG06 dataset show that the proposed method provides significant improvement over standard TREC baselines and achieves a 2.5% increase in MAP over the best performing run in the TREC 2008 blog track.","PeriodicalId":378368,"journal":{"name":"Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125760938","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 73
User centered story tracking 以用户为中心跟踪故事
Ilija Subasic
{"title":"User centered story tracking","authors":"Ilija Subasic","doi":"10.1145/1835449.1835693","DOIUrl":"https://doi.org/10.1145/1835449.1835693","url":null,"abstract":"Using data collections available on the Internet has for many people became the main medium for staying informed about the world. Many of these collections are in nature dynamic, evolving as the subjects they describe change. The goal of different research areas is to identify and highlight these changes to better enable readers to track stories. In this work we restrict ourselves to news collections and investigate \"real-life\" effectiveness and usability of temporal text mining (TTM) story tracking methods. We propose a new story tracking method and build a tool to support it. Additionally, we investigate the effectiveness and usability of story tracking methods and define a new frameworks for automatic and user oriented evaluation. We built methods and tools which allow for understanding, discovery, and search through user interaction. Although there are many TTM methods developed there is a lack of common evaluation procedure. Therefore, we propose an evaluation framework for measuring how different TTM methods discover novel \"facts\". Apart from the automatic evaluation we are interested in how can users interact with pattens and learn about the underlying subjects of the story they track. For this purpose we propose a user testing environment that measures speed and accuracy in which users can use story tracking methods to discover predefined sets of ground-truth sentences.","PeriodicalId":378368,"journal":{"name":"Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122239503","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Session details: Clustering II 会话细节:集群II
Omar Alonso
{"title":"Session details: Clustering II","authors":"Omar Alonso","doi":"10.1145/3254371","DOIUrl":"https://doi.org/10.1145/3254371","url":null,"abstract":"","PeriodicalId":378368,"journal":{"name":"Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134598810","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
A joint probabilistic classification model for resource selection 资源选择的联合概率分类模型
Dzung Hong, Luo Si, Paul J. Bracke, M. Witt, Timothy C Juchcinski
{"title":"A joint probabilistic classification model for resource selection","authors":"Dzung Hong, Luo Si, Paul J. Bracke, M. Witt, Timothy C Juchcinski","doi":"10.1145/1835449.1835468","DOIUrl":"https://doi.org/10.1145/1835449.1835468","url":null,"abstract":"Resource selection is an important task in Federated Search to select a small number of most relevant information sources. Current resource selection algorithms such as GlOSS, CORI, ReDDE, Geometric Average and the recent classification-based method focus on the evidence of individual information sources to determine the relevance of available sources. Current algorithms do not model the important relationship information among individual sources. For example, an information source tends to be relevant to a user query if it is similar to another source with high probability of being relevant. This paper proposes a joint probabilistic classification model for resource selection. The model estimates the probability of relevance of information sources in a joint manner by considering both the evidence of individual sources and their relationship. An extensive set of experiments have been conducted on several datasets to demonstrate the advantage of the proposed model.","PeriodicalId":378368,"journal":{"name":"Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval","volume":"29 3","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131894862","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 30
Robust audio identification for MP3 popular music MP3流行音乐的鲁棒音频识别
Wei Li, Yaduo Liu, X. Xue
{"title":"Robust audio identification for MP3 popular music","authors":"Wei Li, Yaduo Liu, X. Xue","doi":"10.1145/1835449.1835554","DOIUrl":"https://doi.org/10.1145/1835449.1835554","url":null,"abstract":"Audio identification via fingerprint has been an active research field with wide applications for years. Many technical papers were published and commercial software systems were also employed. However, most of these previously reported methods work on the raw audio format in spite of the fact that nowadays compressed format audio, especially MP3 music, has grown into the dominant way to store on personal computers and transmit on the Internet. It would be interesting if a compressed unknown audio fragment is able to be directly recognized from the database without the fussy and time-consuming decompression-identification-recompression procedure. So far, very few algorithms run directly in the compressed domain for music information retrieval, and most of them take advantage of MDCT coefficients or derived energy type of features. As a first attempt, we propose in this paper utilizing compressed-domain spectral entropy as the audio feature to implement a novel audio fingerprinting algorithm. The compressed songs stored in a music database and the possibly distorted compressed query excerpts are first partially decompressed to obtain the MDCT coefficients as the intermediate result. Then by grouping granules into longer blocks, remapping the MDCT coefficients into 192 new frequency lines to unify the frequency distribution of long and short windows, and defining 9 new subbands which cover the main frequency bandwidth of popular songs in accordance with the scale-factor bands of short windows, we calculate the spectral entropy of all consecutive blocks and come to the final fingerprint sequence by means of magnitude relationship modeling. Experiments show that such fingerprints exhibit strong robustness against various audio signal distortions like recompression, noise interference, echo addition, equalization, band-pass filtering, pitch shifting, and slight time-scale modification etc. For 5s-long query examples which might be severely degraded, an average top-five retrieval precision rate of more than 90% can be obtained in our test data set composed of 1822 popular songs.","PeriodicalId":378368,"journal":{"name":"Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132776262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 18
Closed form solution of similarity algorithms 相似算法的封闭形式解
Yuanzhe Cai, Miao Zhang, C. Ding, Sharma Chakravarthy
{"title":"Closed form solution of similarity algorithms","authors":"Yuanzhe Cai, Miao Zhang, C. Ding, Sharma Chakravarthy","doi":"10.1145/1835449.1835577","DOIUrl":"https://doi.org/10.1145/1835449.1835577","url":null,"abstract":"Algorithms defining similarities between objects of an information network are important of many IR tasks. SimRank algorithm and its variations are popularly used in many applications. Many fast algorithms are also developed. In this note, we first reformulate them as random walks on the network and express them using forward and backward transition probably in a matrix form. Second, we show that P-Rank (SimRank is only the special case of P-Rank) has a unique solution of eeT when decay factor c is equal to 1. We also show that SimFusion algorithm is a special case of P-Rank algorithm and prove that the similarity matrix of SimFusion is the product of PageRank vector. Our experiments on the web datasets show that for P-Rank the decay factor c doesn't seriously affect the similarity accuracy and accuracy of P-Rank is also higher than SimFusion and SimRank.","PeriodicalId":378368,"journal":{"name":"Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133258316","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 8
Exploring reductions for long web queries 探索减少长网页查询
Niranjan Balasubramanian, G. Kumaran, Vitor R. Carvalho
{"title":"Exploring reductions for long web queries","authors":"Niranjan Balasubramanian, G. Kumaran, Vitor R. Carvalho","doi":"10.1145/1835449.1835545","DOIUrl":"https://doi.org/10.1145/1835449.1835545","url":null,"abstract":"Long queries form a difficult, but increasingly important segment for web search engines. Query reduction, a technique for dropping unnecessary query terms from long queries, improves performance of ad-hoc retrieval on TREC collections. Also, it has great potential for improving long web queries (upto 25% improvement in NDCG@5). However, query reduction on the web is hampered by the lack of accurate query performance predictors and the constraints imposed by search engine architectures and ranking algorithms. In this paper, we present query reduction techniques for long web queries that leverage effective and efficient query performance predictors. We propose three learning formulations that combine these predictors to perform automatic query reduction. These formulations enable trading of average improvements for the number of queries impacted, and enable easy integration into the search engine's architecture for rank-time query reduction. Experiments on a large collection of long queries issued to a commercial search engine show that the proposed techniques significantly outperform baselines, with more than 12% improvement in NDCG@5 in the impacted set of queries. Extension to the formulations such as result interleaving further improves results. We find that the proposed techniques deliver consistent retrieval gains where it matters most: poorly performing long web queries.","PeriodicalId":378368,"journal":{"name":"Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval","volume":"374 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115990497","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 89
Session details: Applications I 会话细节:应用程序
Luo Si
{"title":"Session details: Applications I","authors":"Luo Si","doi":"10.1145/3254367","DOIUrl":"https://doi.org/10.1145/3254367","url":null,"abstract":"","PeriodicalId":378368,"journal":{"name":"Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval","volume":"136 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116852398","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信