直接优化学习排序模型的排序方法

Ming Tan, Tian Xia, L. Guo, Shaojun Wang
{"title":"直接优化学习排序模型的排序方法","authors":"Ming Tan, Tian Xia, L. Guo, Shaojun Wang","doi":"10.1145/2487575.2487630","DOIUrl":null,"url":null,"abstract":"We present a novel learning algorithm, DirectRank, which directly and exactly optimizes ranking measures without resorting to any upper bounds or approximations. Our approach is essentially an iterative coordinate ascent method. In each iteration, we choose one coordinate and only update the corresponding parameter, with all others remaining fixed. Since the ranking measure is a stepwise function of a single parameter, we propose a novel line search algorithm that can locate the interval with the best ranking measure along this coordinate quite efficiently. In order to stabilize our system in small datasets, we construct a probabilistic framework for document-query pairs to maximize the likelihood of the objective permutation of top-$\\tau$ documents. This iterative procedure ensures convergence. Furthermore, we integrate regression trees as our weak learners in order to consider the correlation between the different features. Experiments on LETOR datasets and two large datasets, Yahoo challenge data and Microsoft 30K web data, show an improvement over state-of-the-art systems.","PeriodicalId":20472,"journal":{"name":"Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2013-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"29","resultStr":"{\"title\":\"Direct optimization of ranking measures for learning to rank models\",\"authors\":\"Ming Tan, Tian Xia, L. Guo, Shaojun Wang\",\"doi\":\"10.1145/2487575.2487630\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We present a novel learning algorithm, DirectRank, which directly and exactly optimizes ranking measures without resorting to any upper bounds or approximations. Our approach is essentially an iterative coordinate ascent method. In each iteration, we choose one coordinate and only update the corresponding parameter, with all others remaining fixed. Since the ranking measure is a stepwise function of a single parameter, we propose a novel line search algorithm that can locate the interval with the best ranking measure along this coordinate quite efficiently. In order to stabilize our system in small datasets, we construct a probabilistic framework for document-query pairs to maximize the likelihood of the objective permutation of top-$\\\\tau$ documents. This iterative procedure ensures convergence. Furthermore, we integrate regression trees as our weak learners in order to consider the correlation between the different features. Experiments on LETOR datasets and two large datasets, Yahoo challenge data and Microsoft 30K web data, show an improvement over state-of-the-art systems.\",\"PeriodicalId\":20472,\"journal\":{\"name\":\"Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-08-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"29\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2487575.2487630\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2487575.2487630","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 29

摘要

我们提出了一种新的学习算法,DirectRank,它直接和精确地优化排名措施,而不依赖于任何上界或近似。我们的方法本质上是一种迭代坐标上升法。在每次迭代中,我们选择一个坐标,只更新相应的参数,其他参数保持不变。由于排序测度是单参数的阶跃函数,我们提出了一种新的直线搜索算法,该算法可以沿该坐标非常有效地定位到具有最佳排序测度的区间。为了在小数据集中稳定我们的系统,我们为文档-查询对构建了一个概率框架,以最大化top-$\tau$文档客观排列的可能性。这个迭代过程保证了收敛性。此外,为了考虑不同特征之间的相关性,我们将回归树集成为弱学习器。在LETOR数据集和两个大型数据集(雅虎挑战数据和微软30K网络数据)上的实验表明,与最先进的系统相比,这种方法有所改进。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Direct optimization of ranking measures for learning to rank models
We present a novel learning algorithm, DirectRank, which directly and exactly optimizes ranking measures without resorting to any upper bounds or approximations. Our approach is essentially an iterative coordinate ascent method. In each iteration, we choose one coordinate and only update the corresponding parameter, with all others remaining fixed. Since the ranking measure is a stepwise function of a single parameter, we propose a novel line search algorithm that can locate the interval with the best ranking measure along this coordinate quite efficiently. In order to stabilize our system in small datasets, we construct a probabilistic framework for document-query pairs to maximize the likelihood of the objective permutation of top-$\tau$ documents. This iterative procedure ensures convergence. Furthermore, we integrate regression trees as our weak learners in order to consider the correlation between the different features. Experiments on LETOR datasets and two large datasets, Yahoo challenge data and Microsoft 30K web data, show an improvement over state-of-the-art systems.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信