GSRDR-GAN: Global search result diversification ranking approach based on multi-head self-attention and GAN

IF 5.5 · CAS Zone 2 (Computer Science) · JCR Q1 (Computer Science, Artificial Intelligence)
Weidong Liu , Jinzhong Li , Shengbo Chen
Citations: 0

Abstract

Search result diversification ranking aims to generate rankings that comprehensively cover multiple subtopics, but existing methods often struggle to balance ranking diversity with relevance, and face challenges in modeling document interactions and in dealing with limited high-quality training data. While GANs have proven highly successful in fields such as computer vision, their application to search result diversification has been limited by the discrete nature of ranking items and the complex interactions among documents. To address these challenges, we propose GSRDR-GAN, a novel approach that integrates multi-head self-attention with a GAN. Our method consists of four key components designed to address the limitations of traditional approaches: the Selected Document State Retriever, the Subtopic Encoder with Multi-head Self-Attention, the Subtopic Decoder with Multi-head Self-Attention, and the Relevance Predictor. First, a self-attention-based feature extraction module enhances document representations, enabling the model to capture both global and local context effectively. Second, a GAN framework improves generalization by generating diverse rankings, mitigating the shortage of high-quality training data. Third, a carefully designed reward function optimizes the trade-off between ranking diversity and relevance, allowing the model to adaptively prioritize these competing objectives during training. Notably, the method improves the generator's stability and the diversity of search results by reducing training variance, even without pre-trained models. Extensive experiments on the TREC Web Track dataset demonstrate that the proposed GSRDR-GAN method significantly enhances result diversity, achieving relative improvements of 1.7% in α-nDCG, 3.0% in ERR-IA, 3.3% in NRBP, and 0.9% in S-rec over strong baseline methods. Ablation studies and comparative analyses of different reward computation methods further validate the effectiveness of the proposed approach.
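The α-nDCG gains reported above reward rankings that cover new subtopics early: a document's gain shrinks for subtopics already covered higher in the list. Purely as a rough illustration (not the authors' code), a minimal Python sketch of the metric, assuming binary document-subtopic judgments and the standard greedy approximation of the ideal ranking:

```python
import math

def alpha_dcg(ranking, judgments, alpha=0.5, k=20):
    """alpha-DCG@k: a document's gain sums over the subtopics it covers,
    each discounted by (1 - alpha)^(number of earlier docs covering it)."""
    seen = {}  # subtopic -> how many higher-ranked docs already covered it
    score = 0.0
    for rank, doc in enumerate(ranking[:k], start=1):
        subtopics = judgments.get(doc, ())
        gain = sum((1 - alpha) ** seen.get(t, 0) for t in subtopics)
        for t in subtopics:
            seen[t] = seen.get(t, 0) + 1
        score += gain / math.log2(rank + 1)
    return score

def alpha_ndcg(ranking, judgments, alpha=0.5, k=20):
    """Normalize by the alpha-DCG of a greedily built (near-ideal) ranking."""
    ideal, seen = [], {}
    remaining = sorted(judgments)  # deterministic tie-breaking
    while remaining:
        best = max(remaining,
                   key=lambda d: sum((1 - alpha) ** seen.get(t, 0)
                                     for t in judgments[d]))
        ideal.append(best)
        for t in judgments[best]:
            seen[t] = seen.get(t, 0) + 1
        remaining.remove(best)
    denom = alpha_dcg(ideal, judgments, alpha, k)
    return alpha_dcg(ranking, judgments, alpha, k) / denom if denom else 0.0
```

With hypothetical judgments `{"a": {"t1"}, "b": {"t2"}, "c": {"t1"}}`, the subtopic-alternating ranking `["a", "b", "c"]` scores strictly higher than `["a", "c", "b"]`, which repeats subtopic `t1` before covering `t2`.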
Source journal: Neurocomputing
Category: Engineering & Technology, Computer Science: Artificial Intelligence
CiteScore: 13.10
Self-citation rate: 10.00%
Annual articles: 1382
Review time: 70 days
Journal introduction: Neurocomputing publishes articles describing recent fundamental contributions in the field of neurocomputing. Neurocomputing theory, practice and applications are the essential topics being covered.