Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval: latest articles

To translate or not to translate?
Chia-Jung Lee, Chin-Hui Chen, Shao-Hang Kao, Pu-Jen Cheng
{"title":"To translate or not to translate?","authors":"Chia-Jung Lee, Chin-Hui Chen, Shao-Hang Kao, Pu-Jen Cheng","doi":"10.1145/1835449.1835558","DOIUrl":"https://doi.org/10.1145/1835449.1835558","url":null,"abstract":"Query translation is an important task in cross-language information retrieval (CLIR) aiming to translate queries into languages used in documents. The purpose of this paper is to investigate the necessity of translating query terms, which might differ from one term to another. Some untranslated terms cause irreparable performance drop while others do not. We propose an approach to estimate the translation probability of a query term, which helps decide if it should be translated or not. The approach learns regression and classification models based on a rich set of linguistic and statistical properties of the term. Experiments on NTCIR-4 and NTCIR-5 English-Chinese CLIR tasks demonstrate that the proposed approach can significantly improve CLIR performance. An in-depth analysis is also provided for discussing the impact of untranslated out-of-vocabulary (OOV) query terms and translation quality of non-OOV query terms on CLIR performance.","PeriodicalId":378368,"journal":{"name":"Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval","volume":"615 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123949530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
Temporal diversity in recommender systems
N. Lathia, S. Hailes, L. Capra, X. Amatriain
{"title":"Temporal diversity in recommender systems","authors":"N. Lathia, S. Hailes, L. Capra, X. Amatriain","doi":"10.1145/1835449.1835486","DOIUrl":"https://doi.org/10.1145/1835449.1835486","url":null,"abstract":"Collaborative Filtering (CF) algorithms, used to build web-based recommender systems, are often evaluated in terms of how accurately they predict user ratings. However, current evaluation techniques disregard the fact that users continue to rate items over time: the temporal characteristics of the system's top-N recommendations are not investigated. In particular, there is no means of measuring the extent that the same items are being recommended to users over and over again. In this work, we show that temporal diversity is an important facet of recommender systems, by showing how CF data changes over time and performing a user survey. We then evaluate three CF algorithms from the point of view of the diversity in the sequence of recommendation lists they produce over time. We examine how a number of characteristics of user rating patterns (including profile size and time between rating) affect diversity. We then propose and evaluate set methods that maximise temporal recommendation diversity without extensively penalising accuracy.","PeriodicalId":378368,"journal":{"name":"Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125867539","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 318
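The abstract above mentions measuring how far the same items are recommended to a user over and over again. A minimal sketch of one way to quantify this: the fraction of the current top-N list that was absent from the previous top-N list. The function name `temporal_diversity` and the exact formula are assumptions made for illustration; the abstract does not spell out the measure.

```python
def temporal_diversity(previous, current, n):
    """Fraction of the current top-n items that were not in the previous top-n.

    `previous` and `current` are ranked lists of item ids (hypothetical input
    format). Returns 0.0 when the list is unchanged and 1.0 when every item is new.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    previous_top = set(previous[:n])
    new_items = [item for item in current[:n] if item not in previous_top]
    return len(new_items) / n


# Usage: two of the three items recommended this week are new.
last_week = ["a", "b", "c"]
this_week = ["a", "d", "e"]
print(temporal_diversity(last_week, this_week, 3))
```

Averaging this value over consecutive recommendation lists gives a per-user score that can be compared across algorithms.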
Re-examination on lam% in spam filtering
Haoliang Qi, Muyun Yang, Xiaoning He, Sheng Li
{"title":"Re-examination on lam% in spam filtering","authors":"Haoliang Qi, Muyun Yang, Xiaoning He, Sheng Li","doi":"10.1145/1835449.1835601","DOIUrl":"https://doi.org/10.1145/1835449.1835601","url":null,"abstract":"Logistic average misclassification percentage (lam%) is a key measure for the spam filtering performance. This paper demonstrates that a spam filter can achieve a perfect 0.00% in lam%, the minimal value in theory, by simply setting a biased threshold during the classifier modeling. At the same time, the overall classification performance reaches only a low accuracy. The result suggests that the role of lam% for spam filtering evaluation should be re-examined.","PeriodicalId":378368,"journal":{"name":"Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127486623","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
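The claim in the abstract above can be illustrated numerically. lam% is commonly defined as the inverse logit of the mean of the logits of the ham and spam misclassification rates, so it collapses to 0 as soon as either rate is exactly 0. The sketch below assumes that definition; the class sizes and error rates in the usage example are made-up numbers, not results from the paper.

```python
import math


def logit(p):
    """Log-odds of a rate in [0, 1], with the endpoints mapped to +/- infinity."""
    if p <= 0.0:
        return -math.inf
    if p >= 1.0:
        return math.inf
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def lam_percent(ham_miss_rate, spam_miss_rate):
    """Logistic average misclassification percentage (assumed definition).

    Undefined (NaN) when one rate is 0 and the other is 1.
    """
    mean_logit = (logit(ham_miss_rate) + logit(spam_miss_rate)) / 2.0
    return 100.0 * inv_logit(mean_logit)


# Usage with hypothetical numbers: a threshold biased so that no spam is
# missed, at the cost of misclassifying 90% of ham.
n_ham, n_spam = 500, 500
ham_miss, spam_miss = 0.9, 0.0
accuracy = (n_ham * (1 - ham_miss) + n_spam * (1 - spam_miss)) / (n_ham + n_spam)
print(lam_percent(ham_miss, spam_miss))  # prints 0.0
print(accuracy)
```

Here lam% reports a perfect score while barely more than half of the messages are classified correctly, which is the kind of mismatch the abstract describes.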
Geometric representations for multiple documents
Jangwon Seo, W. Bruce Croft
{"title":"Geometric representations for multiple documents","authors":"Jangwon Seo, W. Bruce Croft","doi":"10.1145/1835449.1835493","DOIUrl":"https://doi.org/10.1145/1835449.1835493","url":null,"abstract":"Combining multiple documents to represent an information object is well-known as an effective approach for many Information Retrieval tasks. For example, passages can be combined to represent a document for retrieval, document clusters are represented using combinations of the documents they contain, and feedback documents can be combined to represent a query model. Various techniques for combination have been introduced, and among them, representation techniques based on concatenation and the arithmetic mean are frequently used. Some recent work has shown the potential of a new representation technique using the geometric mean. However, these studies lack a theoretical foundation explaining why the geometric mean should have advantages for representing multiple documents. In this paper, we show that the arithmetic mean and the geometric mean are approximations to the center of mass in certain geometries, and show empirically that the geometric mean is closer to the center. Through experiments with two IR tasks, we show the potential benefits for geometric representations, including a geometry-based pseudo-relevance feedback method that outperforms state-of-the-art techniques.","PeriodicalId":378368,"journal":{"name":"Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131093026","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 35
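To make the contrast in the abstract above concrete, the following sketch combines several term distributions into one representation using either the arithmetic mean or the normalized geometric mean. It is only an illustration under stated assumptions: the helper names, the dictionary format, and the small `eps` floor that stands in for proper smoothing are hypothetical and are not taken from the paper.

```python
import math


def arithmetic_mean(models):
    """Term-wise arithmetic mean of term distributions given as dicts."""
    vocab = set().union(*models)
    n = len(models)
    return {w: sum(m.get(w, 0.0) for m in models) / n for w in vocab}


def geometric_mean(models, eps=1e-6):
    """Term-wise geometric mean, renormalized to sum to 1.

    `eps` is an assumed floor that keeps log() defined for unseen terms;
    a real system would use a properly smoothed language model instead.
    """
    vocab = set().union(*models)
    n = len(models)
    raw = {
        w: math.exp(sum(math.log(m.get(w, 0.0) + eps) for m in models) / n)
        for w in vocab
    }
    total = sum(raw.values())
    return {w: value / total for w, value in raw.items()}


# Usage: "a" occurs in both documents, "b" and "c" in only one each.
doc_1 = {"a": 0.5, "b": 0.5}
doc_2 = {"a": 0.5, "c": 0.5}
print(arithmetic_mean([doc_1, doc_2]))
print(geometric_mean([doc_1, doc_2]))
```

In this toy case the geometric mean concentrates probability on the term shared by both documents, while the arithmetic mean keeps substantial mass on terms that appear in only one of them.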
Session details: Summarization & user feedback
E. Liddy
{"title":"Session details: Summarization & user feedback","authors":"E. Liddy","doi":"10.1145/3254382","DOIUrl":"https://doi.org/10.1145/3254382","url":null,"abstract":"","PeriodicalId":378368,"journal":{"name":"Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval","volume":"78 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126826639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Context aware query classification using dynamic query window and relationship net
Nazli Goharian, Saket S. R. Mengle
{"title":"Context aware query classification using dynamic query window and relationship net","authors":"Nazli Goharian, Saket S. R. Mengle","doi":"10.1145/1835449.1835584","DOIUrl":"https://doi.org/10.1145/1835449.1835584","url":null,"abstract":"The context of the user queries, preceding a given query, is utilized to improve the effectiveness of query classification. Earlier efforts utilize a fixed number of preceding queries to derive such context information. We propose and evaluate an approach (DQW) that identifies a set of unambiguous preceding queries in a dynamically determined window to utilize in classifying an ambiguous query. Furthermore, utilizing a relationship-net (R-net) that represents relationships among known categories, we improve the classification effectiveness for those ambiguous queries whose predicted category in this relationship-net is related to the category of a query within the window. Our results indicate that the hybrid approach (DQW+R-net) statistically significantly improves the Conditional Random Field (CRF) query classification approach when static query windowing and hierarchical taxonomy are used (SQW+Tax), in terms of precision (10.8%), recall (13.2%), and F1 measure (11.9%).","PeriodicalId":378368,"journal":{"name":"Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121533292","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Comparing the sensitivity of information retrieval metrics
Filip Radlinski, Nick Craswell
{"title":"Comparing the sensitivity of information retrieval metrics","authors":"Filip Radlinski, Nick Craswell","doi":"10.1145/1835449.1835560","DOIUrl":"https://doi.org/10.1145/1835449.1835560","url":null,"abstract":"Information retrieval effectiveness is usually evaluated using measures such as Normalized Discounted Cumulative Gain (NDCG), Mean Average Precision (MAP) and Precision at some cutoff (Precision@k) on a set of judged queries. Recent research has suggested an alternative, evaluating information retrieval systems based on user behavior. Particularly promising are experiments that interleave two rankings and track user clicks. According to a recent study, interleaving experiments can identify large differences in retrieval effectiveness with much better reliability than other click-based methods. We study interleaving in more detail, comparing it with traditional measures in terms of reliability, sensitivity and agreement. To detect very small differences in retrieval effectiveness, a reliable outcome with standard metrics requires about 5,000 judged queries, and this is about as reliable as interleaving with 50,000 user impressions. Amongst the traditional measures, NDCG has the strongest correlation with interleaving. Finally, we present some new forms of analysis, including an approach to enhance interleaving sensitivity.","PeriodicalId":378368,"journal":{"name":"Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121831496","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 111
Session details: Web IR and social media search
H. Zaragoza
{"title":"Session details: Web IR and social media search","authors":"H. Zaragoza","doi":"10.1145/3254378","DOIUrl":"https://doi.org/10.1145/3254378","url":null,"abstract":"","PeriodicalId":378368,"journal":{"name":"Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124919460","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Visual summarization of web pages
Binxing Jiao, Linjun Yang, Jizheng Xu, Feng Wu
{"title":"Visual summarization of web pages","authors":"Binxing Jiao, Linjun Yang, Jizheng Xu, Feng Wu","doi":"10.1145/1835449.1835533","DOIUrl":"https://doi.org/10.1145/1835449.1835533","url":null,"abstract":"Visual summarization is an attractive new scheme to summarize web pages, which can help achieve a more friendly user experience in search and re-finding tasks by allowing users to quickly get the idea of what the web page is about and helping users recall the visited web page. In this paper, we perform a careful study on the recently proposed visual summarization approaches, including the thumbnail of the web page snapshot, the internal image in the web page which is representative of the content in the page, and the visual snippet which is a synthesized image based on the internal image, the title, and the logo found in the web page. Moreover, since the internal image based summarization approach hardly works when the representative internal images are unavailable, we propose a new strategy, which retrieves the representative image from the external to summarize the web page. The experimental results suggest that the various summarization approaches have respective advantages on different types of web pages. While internal images and thumbnails can provide a reliable summarization on web pages with dominant images and web pages with simple structure respectively, the external images are regarded as useful information to complement the internal images and are demonstrated to be very useful in helping users understand new web pages. The visual snippet performs well on the re-finding tasks since it incorporates the title and logo which are advantageous for identifying the visited web pages.","PeriodicalId":378368,"journal":{"name":"Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124966706","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 35
Using flickr geotags to predict user travel behaviour
M. Clements, P. Serdyukov, A. D. Vries, M. Reinders
{"title":"Using flickr geotags to predict user travel behaviour","authors":"M. Clements, P. Serdyukov, A. D. Vries, M. Reinders","doi":"10.1145/1835449.1835648","DOIUrl":"https://doi.org/10.1145/1835449.1835648","url":null,"abstract":"We propose a method to predict a user's favourite locations in a city, based on his Flickr geotags in other cities. We define a similarity between the geotag distributions of two users based on a Gaussian kernel convolution. The geotags of the most similar users are then combined to rerank the popular locations in the target city personalised for this user. We show that this method can give personalised travel recommendations for users with a clear preference for a specific type of landmark.","PeriodicalId":378368,"journal":{"name":"Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121582056","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 112