Towards long-term depolarized interactive recommendations

IF 7.4 · Zone 1 (Management) · JCR Q1, Computer Science, Information Systems
Mohamed Lechiakh, Zakaria El-Moutaouakkil, Alexandre Maurer
Journal: Information Processing & Management
Publication date: 2024-07-19
Publication type: Journal Article
DOI: 10.1016/j.ipm.2024.103833
URL: https://www.sciencedirect.com/science/article/pii/S0306457324001924
Citations: 0

Abstract

Personalization is a prominent process in today’s recommender systems (RS) that enhances user satisfaction and platform profitability. However, recent studies suggest that over-personalization may lead to polarized user preferences, which can result in filter bubbles and echo-chamber effects. These effects have usually been mitigated with immediate polarization solutions that target short-term recommendation goals in static RS settings. In this work, we explore the problem of long-term user polarization resulting from over-personalized multi-step interactive recommendations. We propose a framework to measure and limit the polarization of user preferences, based on the item categories consumed over continuous T-step recommendations. Within this framework, we develop three recommendation approaches based on Deep Q-Networks (DQN), each incorporating distinct polarization-constraining and training techniques. First, we propose I-CDQN, an instantaneously constrained DQN algorithm in which user polarization is forced to remain below a certain threshold at each recommendation step. Second, we propose RP-DQN, a DQN-based method that incorporates polarization penalization terms into the reward and the DQN loss function. Third, we introduce RC-DQN with a double DQN architecture, which constrains user polarization at the category level using the first DQN, then trains the second, unconstrained DQN using items from restricted category-related action spaces. The proposed methods differ in the way they apply polarization constraints, which can significantly impact their performance and their suitability for specific application use cases. We conducted extensive experiments on real-world datasets using cold- and warm-start scenarios for T-step interactive recommendations. Interestingly, RC-DQN outperforms both I-CDQN and RP-DQN, demonstrating the best balance between user polarization and personalization, and achieving a significant improvement in personalization results compared to the best-performing baseline methods across all experiments, e.g., about 3.6% for T = 30 steps.
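
The three constraint styles described in the abstract can be illustrated with a toy sketch. The abstract does not define the polarization measure or the networks, so everything below is an assumption made for illustration only: polarization is taken to be one minus the normalized entropy of the consumed categories, Q-values are plain dictionaries standing in for DQN outputs, and all function names are hypothetical. This is not the authors' implementation.

```python
import math
from collections import Counter


def polarization(category_history, num_categories):
    """Assumed measure (not from the paper): 1 - normalized entropy of the
    categories consumed so far. 0 = perfectly balanced, 1 = single category."""
    if not category_history or num_categories < 2:
        return 0.0
    total = len(category_history)
    counts = Counter(category_history)
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    return 1.0 - entropy / math.log(num_categories)


def constrained_greedy_action(q_values, item_category, history, num_categories, threshold):
    """I-CDQN-style step: choose the highest-Q action whose consumption keeps
    polarization at or below the threshold. If no action is feasible, fall back
    to the least polarizing one (the fallback rule is an assumption)."""
    scored = [
        (item, q, polarization(history + [item_category[item]], num_categories))
        for item, q in q_values.items()
    ]
    feasible = [s for s in scored if s[2] <= threshold]
    if feasible:
        return max(feasible, key=lambda s: s[1])[0]
    return min(scored, key=lambda s: s[2])[0]


def penalized_reward(reward, history, num_categories, weight):
    """RP-DQN-style shaping: subtract a weighted polarization penalty from the
    reward (the linear form and the weight are assumptions)."""
    return reward - weight * polarization(history, num_categories)


def two_level_action(category_q, item_q, item_category, history, num_categories, threshold):
    """RC-DQN-style step: a constrained choice over categories, followed by an
    unconstrained choice among the items of the chosen category. Assumes every
    category contains at least one item."""
    identity = {c: c for c in category_q}  # each category "belongs" to itself
    chosen = constrained_greedy_action(category_q, identity, history, num_categories, threshold)
    candidates = {i: q for i, q in item_q.items() if item_category[i] == chosen}
    return max(candidates, key=candidates.get)


# Toy usage: the user has mostly consumed category "a".
history = ["a", "a", "a", "b"]
item_category = {1: "a", 2: "b", 3: "c", 4: "b"}
item_q = {1: 0.9, 2: 0.5, 3: 0.4, 4: 0.7}
print(constrained_greedy_action(item_q, item_category, history, 3, threshold=0.4))
print(penalized_reward(1.0, history, 3, weight=0.5))
print(two_level_action({"a": 0.9, "b": 0.6, "c": 0.3}, item_q, item_category, history, 3, 0.4))
```

In this sketch, the instantaneous constraint filters actions at selection time, the penalty changes what the agent learns to value, and the two-level variant moves the constraint up to the category choice so that item selection inside the chosen category stays unconstrained.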

Source journal
Information Processing & Management (Engineering and Technology: Computer Science, Information Systems)
CiteScore: 17.00
Self-citation rate: 11.60%
Articles published: 276
Review time: 39 days
Journal description: Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Its scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing. The journal aims to serve both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. It places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research.