Propensity-Dependent Model for Unbiased Learning-to-Rank
Haochen Zhang, Ziqing Wu, Zhuoran Peng, Tiancheng Luo, Xianna Weng
Proceedings of the 2022 4th International Conference on Software Engineering and Development, 2022-11-25
DOI: 10.1145/3582084.3582085 (https://doi.org/10.1145/3582084.3582085)
Citations: 0
Abstract
Most existing unbiased learning-to-rank models rely on the trust bias assumption to learn a ranking policy via Inverse Propensity Scoring (IPS). The trust bias assumption relaxes the unrealistic noise-free assumption of the Position-Based Model, but it still assumes that the propensities are independent. In this paper, we relax this independence assumption and allow the propensities of different positions to be related to one another. In particular, we model the relationship between different propensities as a Propensity-Dependent model and use both the IPS estimator and the Doubly Robust estimator to learn the optimal ranking policy. Finally, we generate a dataset in a simulation study and evaluate the model's performance on it.
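To make the IPS idea concrete, the following is a minimal sketch of inverse propensity weighting on simulated clicks. It assumes the plain Position-Based Model with known, independent propensities; it is not the Propensity-Dependent model proposed in the paper, and it models neither trust bias nor the Doubly Robust estimator. The function name `ips_relevance_estimate`, the toy relevance and propensity values, and the simulation setup are all hypothetical choices made for illustration.

```python
import numpy as np


def ips_relevance_estimate(clicks, positions, propensities):
    """Inverse-propensity-scored estimate of per-document relevance.

    clicks:       (n_sessions, n_docs) 0/1 array of observed clicks
    positions:    (n_sessions, n_docs) 0-based rank at which each document was shown
    propensities: (n_positions,) examination probability of each rank (assumed known)
    """
    clicks = np.asarray(clicks, dtype=float)
    # Look up the propensity of the rank each document occupied, then invert it.
    weights = 1.0 / np.asarray(propensities, dtype=float)[np.asarray(positions)]
    # Average the reweighted clicks over sessions, one value per document.
    return (clicks * weights).mean(axis=0)


# Toy simulation under the Position-Based Model (all values are assumptions):
# P(click) = P(examined at rank) * P(relevant).
rng = np.random.default_rng(0)
relevance = np.array([0.9, 0.5, 0.1])      # hypothetical true relevance per document
propensities = np.array([1.0, 0.5, 0.25])  # hypothetical examination probability per rank
n_sessions = 200_000

# Each session shows the three documents in a uniformly random order.
positions = np.argsort(rng.random((n_sessions, 3)), axis=1)
examined = rng.random((n_sessions, 3)) < propensities[positions]
clicked = examined & (rng.random((n_sessions, 3)) < relevance)

naive = clicked.mean(axis=0)  # raw click-through rate, biased by position
ips = ips_relevance_estimate(clicked, positions, propensities)

print("true relevance:", relevance)
print("naive estimate:", naive.round(3))
print("IPS estimate:  ", ips.round(3))
```

In this setup the raw click-through rate underestimates relevance because lower ranks are examined less often, while dividing each click by the propensity of its rank corrects for that in expectation. The correction relies on the propensities being specified correctly, which is where assumptions about how propensities at different positions relate to each other become important.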