Query-level learning to rank using isotonic regression

Zhaohui Zheng, H. Zha, Gordon Sun
Venue: 2008 46th Annual Allerton Conference on Communication, Control, and Computing
Publication date: 2008-09-01
DOI: 10.1109/ALLERTON.2008.4797684 (https://doi.org/10.1109/ALLERTON.2008.4797684)
Source: Semantic Scholar
Citations: 28

Abstract

Ranking functions determine the relevance of search results in search engines, and learning ranking functions has become an active research area at the interface between Web search, information retrieval, and machine learning. Generally, the training data for learning to rank come in two forms: (1) absolute relevance judgments, which assess the degree of relevance of a document with respect to a query; this type of judgment is also called labeled data and is usually obtained through human editorial effort; and (2) relative relevance judgments, which indicate that one document is more relevant than another with respect to a query; this type of judgment is also called preference data and can usually be extracted from abundantly available user click-through data recording users' interactions with the search results. Most existing learning-to-rank methods ignore query boundaries, treating the labeled data or preference data equally across queries. In this paper, we propose a minimum effort optimization method that takes into account the entire training data within a query at each iteration. We tackle this optimization problem using functional iterative methods in which the update at each iteration is computed by solving an isotonic regression problem. This more global approach results in faster convergence and significantly improved performance of the learned ranking functions over existing state-of-the-art methods. We demonstrate the effectiveness of the proposed method using data sets obtained from a commercial search engine as well as publicly available data.
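
The abstract states that each update is computed by solving an isotonic regression problem. As a rough illustration of that subproblem only, and not the authors' algorithm, the sketch below fits a least-squares non-decreasing sequence with the standard pool-adjacent-violators procedure. The function names `isotonic_regression` and `min_effort_update` are hypothetical, and the sketch assumes a simplified setting in which the documents of one query are already listed in a strict order from least to most relevant, so the preference constraints form a single chain.

```python
from typing import List, Sequence


def isotonic_regression(y: Sequence[float]) -> List[float]:
    """Least-squares non-decreasing fit to y (pool-adjacent-violators)."""
    # Each block holds [sum of values, number of values]; its fit is the mean.
    blocks: List[List[float]] = []
    for value in y:
        blocks.append([float(value), 1.0])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted: List[float] = []
    for total, count in blocks:
        fitted.extend([total / count] * int(count))
    return fitted


def min_effort_update(scores: Sequence[float]) -> List[float]:
    """Smallest squared-norm change that makes the scores non-decreasing.

    Assumption (not from the paper): scores are the current ranking-function
    outputs for one query's documents, listed from least to most relevant.
    """
    fitted = isotonic_regression(scores)
    return [g - f for f, g in zip(scores, fitted)]


if __name__ == "__main__":
    current_scores = [1.0, 3.0, 2.0, 4.0]  # documents 2 and 3 are mis-ordered
    print(isotonic_regression(current_scores))  # [1.0, 2.5, 2.5, 4.0]
    print(min_effort_update(current_scores))    # [0.0, -0.5, 0.5, 0.0]
```

In this simplified reading, the returned adjustments are the per-document targets that a functional iterative method could fit at the next iteration; how the paper handles ties, margins, and general preference graphs is not specified in the abstract and is not modeled here.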