{"title":"DVGAN","authors":"Jiongnan Liu, Zhicheng Dou, Xiaojie Wang, Shuqi Lu, Ji-rong Wen","doi":"10.1145/3397271.3401084","DOIUrl":null,"url":null,"abstract":"Search result diversification aims to retrieve diverse results to cover as many subtopics related to the query as possible. Recent studies showed that supervised diversification models are able to outperform the heuristic approaches, by automatically learning a diversification function other than using manually designed score functions. The main challenge of training a diversification model is the lack of high-quality training samples. Due to the involvement of dependence between documents in the ranker, it is very hard for training algorithms to select effective positive and negative ranking lists to train a reliable ranking model, given a large number of candidate documents within which different documents are relevant to different subtopics. To tackle this problem, we propose a supervised diversification framework based on Generative Adversarial Network (GAN). It consists of a generator and a discriminator interacting with each other in a minimax game. Specifically, the generator generates more confusing negative samples for the discriminator, and the discriminator sends back complementary ranking signals to the generator. Furthermore, we explicitly exploit subtopics in the generator, whereas focusing on modeling document similarity in the discriminator. Through such a minimax game, we are able to obtain better ranking models by combining ranking signals learned by the generator and the discriminator. Experimental results on the TREC Web Track dataset show that the proposed method can significantly outperform existing diversification methods.","PeriodicalId":252050,"journal":{"name":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2020-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"20","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3397271.3401084","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 20
Abstract
Search result diversification aims to retrieve diverse results that cover as many subtopics related to the query as possible. Recent studies have shown that supervised diversification models can outperform heuristic approaches by automatically learning a diversification function rather than relying on manually designed scoring functions. The main challenge in training a diversification model is the lack of high-quality training samples. Because the ranker must model dependencies between documents, it is very hard for training algorithms to select effective positive and negative ranking lists with which to train a reliable ranking model, given a large pool of candidate documents in which different documents are relevant to different subtopics. To tackle this problem, we propose a supervised diversification framework based on the Generative Adversarial Network (GAN). It consists of a generator and a discriminator that interact in a minimax game: the generator produces increasingly confusing negative samples for the discriminator, and the discriminator sends complementary ranking signals back to the generator. Furthermore, we explicitly exploit subtopics in the generator, while focusing on modeling document similarity in the discriminator. Through this minimax game, we obtain better ranking models by combining the ranking signals learned by the generator and the discriminator. Experimental results on the TREC Web Track dataset show that the proposed method significantly outperforms existing diversification methods.
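To make the minimax scheme the abstract describes more concrete, the following is a minimal PyTorch sketch, not the authors' DVGAN implementation: the `Generator` and `Discriminator` modules, all dimensions, the pairwise-similarity penalty, and the REINFORCE-style generator update are illustrative assumptions consistent only with what the abstract states (a subtopic-aware generator that samples negative ranking lists, a similarity-aware discriminator, and ranking signals fed back as a reward).

```python
# Illustrative sketch of the generator/discriminator minimax game described in
# the abstract. All names, dimensions, and the policy-gradient update are
# assumptions for illustration, NOT the paper's actual architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

DOC_DIM, TOPIC_DIM, HID = 64, 16, 32  # hypothetical embedding sizes

class Generator(nn.Module):
    """Scores candidate documents conditioned on a subtopic embedding; the
    scores define a distribution from which negative ranking lists are drawn."""
    def __init__(self):
        super().__init__()
        self.score = nn.Sequential(
            nn.Linear(DOC_DIM + TOPIC_DIM, HID), nn.ReLU(), nn.Linear(HID, 1))

    def forward(self, docs, subtopic):
        # docs: (n_docs, DOC_DIM); subtopic: (TOPIC_DIM,)
        x = torch.cat([docs, subtopic.expand(docs.size(0), -1)], dim=-1)
        return self.score(x).squeeze(-1)  # one score per document

class Discriminator(nn.Module):
    """Scores a ranking list; a pairwise cosine-similarity penalty stands in
    for the document-similarity modeling mentioned in the abstract."""
    def __init__(self):
        super().__init__()
        self.doc_score = nn.Linear(DOC_DIM, 1)

    def forward(self, docs):
        rel = self.doc_score(docs).squeeze(-1).mean()  # relevance component
        unit = F.normalize(docs, dim=-1)
        sim = (unit @ unit.T).mean()                   # redundancy component
        return rel - sim                               # reward diverse lists

gen, dis = Generator(), Discriminator()
g_opt = torch.optim.Adam(gen.parameters(), lr=1e-3)
d_opt = torch.optim.Adam(dis.parameters(), lr=1e-3)

def train_step(candidates, positive_list, subtopic, k=5):
    # Discriminator step: push positive lists up, generated negatives down.
    with torch.no_grad():
        probs = F.softmax(gen(candidates, subtopic), dim=-1)
        neg_idx = torch.multinomial(probs, k, replacement=False)
    d_loss = -(torch.log(torch.sigmoid(dis(positive_list))) +
               torch.log(torch.sigmoid(-dis(candidates[neg_idx]))))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # Generator step: REINFORCE with the discriminator's score as the
    # "ranking signal sent back" (sampling a list is non-differentiable).
    probs = F.softmax(gen(candidates, subtopic), dim=-1)
    idx = torch.multinomial(probs, k, replacement=False)
    reward = dis(candidates[idx]).detach()
    g_loss = -(torch.log(probs[idx]).sum() * reward)
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()

# Toy usage: random tensors stand in for document and subtopic embeddings.
cands = torch.randn(20, DOC_DIM)
pos = torch.randn(5, DOC_DIM)
topic = torch.randn(TOPIC_DIM)
print(train_step(cands, pos, topic))
```

The split of roles mirrors the abstract: subtopic information enters only through the generator's input, while the discriminator's score depends on inter-document similarity, so each side learns a complementary signal that the other receives during the alternating updates.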