NeuralLoss: A Learnable Pretrained Surrogate Loss for Learning to Rank
Authors: Chen Liu; Cailan Jiang; Lixin Zhou
Journal: IEEE Transactions on Knowledge and Data Engineering, vol. 37, no. 7, pp. 4179-4192
DOI: 10.1109/TKDE.2025.3562450
Published: 2025-04-18
URL: https://ieeexplore.ieee.org/document/10969820/
Citations: 0
Abstract
Learning to Rank (LTR) aims to develop a ranking model from supervised data that ranks a set of items using machine learning techniques. However, because both the losses and the evaluation metrics in LTR are defined over rankings, they are neither continuous nor differentiable, making them difficult to optimize with gradient descent. Various surrogate losses have been proposed to address this issue, yet their connection to ranking metrics is often loose, leading to inconsistencies between the optimization objective and the evaluation metric. In this study, we introduce NeuralLoss, a learnable, pretrained surrogate loss. Trained on data structured around ranking metrics, NeuralLoss learns to approximate those metrics, aligning its optimization objective with the evaluation metric. We employ a Transformer to construct the surrogate model and ensure permutation invariance. The pretrained surrogate loss enables end-to-end training of ranking models via gradient descent and can approximate various ranking metrics simply by adjusting its training data. In this paper, we use NeuralLoss to approximate NDCG and Recall, demonstrating its performance on both document retrieval and cross-modal retrieval tasks. Experimental results indicate that our approach achieves excellent performance and exhibits strong competitiveness across these tasks.
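To make the core difficulty concrete, the sketch below computes NDCG@k, one of the metrics the paper's surrogate is trained to approximate. The sort over predicted scores is the piecewise-constant step that makes the metric non-differentiable, which is why a learned, differentiable surrogate is needed. This is a minimal illustration, not code from the paper; the function names are our own.

```python
import math

def dcg_at_k(relevances, k):
    # DCG@k: graded gains (2^rel - 1) discounted by log2 of the rank position.
    return sum((2 ** rel - 1) / math.log2(i + 2)
               for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(scores, labels, k):
    # Rank items by predicted score, descending. This argsort is constant
    # almost everywhere in `scores`, so its gradient is zero or undefined:
    # the metric cannot be optimized directly with gradient descent.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranked_labels = [labels[i] for i in order]
    ideal_labels = sorted(labels, reverse=True)
    idcg = dcg_at_k(ideal_labels, k)
    return dcg_at_k(ranked_labels, k) / idcg if idcg > 0 else 0.0

# A perfect ranking (scores ordered like the labels) attains NDCG@k = 1.0;
# any swap of differently-labeled items lowers it in a discrete jump.
print(ndcg_at_k([0.9, 0.2, 0.5], [3, 0, 1], k=3))  # 1.0
```

NeuralLoss replaces this step function with a pretrained, permutation-invariant network that maps (scores, labels) to an approximation of the metric, giving the ranking model a smooth training signal.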
About the journal:
The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools, and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.