Disentangled variational auto-encoder enhanced by counterfactual data for debiasing recommendation

IF 5 · Zone 2, Computer Science · Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Complex & Intelligent Systems · Pub Date: 2024-01-12 · DOI: 10.1007/s40747-023-01314-x

Yupu Guo, Fei Cai, Jianming Zheng, Xin Zhang, Honghui Chen


Citations: 0

Abstract


Disentangled variational auto-encoder enhanced by counterfactual data for debiasing recommendation


Recommender systems suffer from a variety of recommendation biases that seriously hinder their development. In response, a series of debiasing methods have been proposed, especially for the two most common biases: popularity bias and amplified subjective bias. However, existing debiasing methods usually concentrate on correcting a single bias. Such single-function methods neglect the bias-coupling issue, in which a recommended item is jointly attributable to multiple biases. Moreover, previous work cannot cope with the lack of supervised signals caused by sparse data, which is commonplace in recommender systems. In this work, we introduce a disentangled debias variational auto-encoder framework (DB-VAE) to address the single-functionality issue, together with a counterfactual data enhancement method to mitigate the adverse effect of data sparsity. Specifically, DB-VAE first extracts, based on collider theory, two types of extreme items that are each affected by only a single bias; these are used, respectively, to learn the latent representation of the corresponding bias, thereby decoupling the biases. The unbiased user representation can then be learned with the help of these decoupled bias representations. Furthermore, the data generation module employs Pearl's framework to produce large amounts of counterfactual data that help train the model fully, compensating for the supervised signals missing because of data sparsity. Extensive experiments on three real-world data sets demonstrate the effectiveness of the proposed model. In the best scenario, our model outperforms the best baseline by 19.5% in terms of Recall@20 and by 9.5% in terms of NDCG@100. In addition, the counterfactual data further improve DB-VAE, especially on the data set with low sparsity.
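The abstract reports its results as Recall@20 and NDCG@100. For readers unfamiliar with these ranking metrics, the following is a minimal Python sketch of their common binary-relevance definitions for a single user. The function names and the toy data are illustrative assumptions, and the paper's exact evaluation protocol (for example, how Recall is normalized) is not stated here and may differ.

```python
import math


def recall_at_k(ranked_items, relevant_items, k):
    """Share of the user's relevant items that appear in the top-k ranking."""
    relevant = set(relevant_items)
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked_items[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked_items, relevant_items, k):
    """Binary-relevance NDCG: DCG of the top-k list divided by the ideal DCG."""
    relevant = set(relevant_items)
    if not relevant:
        return 0.0
    # A hit at 0-based position `rank` contributes 1 / log2(rank + 2).
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked_items[:k])
        if item in relevant
    )
    # The ideal ranking places all relevant items (at most k) at the top.
    ideal_hits = min(k, len(relevant))
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg


if __name__ == "__main__":
    # Hypothetical toy data: a model's ranking and one user's held-out items.
    ranked = ["a", "b", "c", "d"]
    relevant = {"a", "c", "x"}
    print("Recall@2:", recall_at_k(ranked, relevant, 2))
    print("NDCG@2:", ndcg_at_k(ranked, relevant, 2))
```

In practice both metrics are averaged over all test users. Recall rewards any hit inside the top k equally, while NDCG additionally rewards placing hits nearer the top of the list.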


Source journal

Complex & Intelligent Systems (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

CiteScore: 9.60

Self-citation rate: 10.30%

Articles published: 297

About the journal: Complex & Intelligent Systems aims to provide a forum for presenting and discussing novel approaches, tools and techniques meant for attaining a cross-fertilization between the broad fields of complex systems, computational simulation, and intelligent analytics and visualization. The transdisciplinary research that the journal focuses on will expand the boundaries of our understanding by investigating the principles and processes that underlie many of the most profound problems facing society today.