理解还是操纵？反思现代推荐系统的在线性能收益

IF 5.4 2区计算机科学 Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Information Systems Pub Date : 2023-12-15 DOI:10.1145/3637869

Zhengbang Zhu, Rongjun Qin, Junjie Huang, Xinyi Dai, Yang Yu†, Yong Yu, Weinan Zhang†

{"title":"理解还是操纵？反思现代推荐系统的在线性能收益","authors":"Zhengbang Zhu, Rongjun Qin, Junjie Huang, Xinyi Dai, Yang Yu†, Yong Yu, Weinan Zhang†","doi":"10.1145/3637869","DOIUrl":null,"url":null,"abstract":"<p>Recommender systems are expected to be assistants that help human users find relevant information automatically without explicit queries. As recommender systems evolve, increasingly sophisticated learning techniques are applied and have achieved better performance in terms of user engagement metrics such as clicks and browsing time. The increase in the measured performance, however, can have two possible attributions: a better understanding of user preferences, and a more proactive ability to utilize human bounded rationality to seduce user over-consumption. A natural following question is whether current recommendation algorithms are manipulating user preferences. If so, can we measure the manipulation level? In this paper, we present a general framework for benchmarking the degree of manipulations of recommendation algorithms, in both slate recommendation and sequential recommendation scenarios. The framework consists of four stages, initial preference calculation, training data collection, algorithm training and interaction, and metrics calculation that involves two proposed metrics, Manipulation Score and Preference Shift. We benchmark some representative recommendation algorithms in both synthetic and real-world datasets under the proposed framework. We have observed that a high online click-through rate does not necessarily mean a better understanding of user initial preference, but ends in prompting users to choose more documents they initially did not favor. Moreover, we find that the training data have notable impacts on the manipulation degrees, and algorithms with more powerful modeling abilities are more sensitive to such impacts. The experiments also verified the usefulness of the proposed metrics for measuring the degree of manipulations. We advocate that future recommendation algorithm studies should be treated as an optimization problem with constrained user preference manipulations.</p>","PeriodicalId":50936,"journal":{"name":"ACM Transactions on Information Systems","volume":"26 1","pages":""},"PeriodicalIF":5.4000,"publicationDate":"2023-12-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Understanding or Manipulation: Rethinking Online Performance Gains of Modern Recommender Systems\",\"authors\":\"Zhengbang Zhu, Rongjun Qin, Junjie Huang, Xinyi Dai, Yang Yu†, Yong Yu, Weinan Zhang†\",\"doi\":\"10.1145/3637869\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>Recommender systems are expected to be assistants that help human users find relevant information automatically without explicit queries. As recommender systems evolve, increasingly sophisticated learning techniques are applied and have achieved better performance in terms of user engagement metrics such as clicks and browsing time. The increase in the measured performance, however, can have two possible attributions: a better understanding of user preferences, and a more proactive ability to utilize human bounded rationality to seduce user over-consumption. A natural following question is whether current recommendation algorithms are manipulating user preferences. If so, can we measure the manipulation level? In this paper, we present a general framework for benchmarking the degree of manipulations of recommendation algorithms, in both slate recommendation and sequential recommendation scenarios. The framework consists of four stages, initial preference calculation, training data collection, algorithm training and interaction, and metrics calculation that involves two proposed metrics, Manipulation Score and Preference Shift. We benchmark some representative recommendation algorithms in both synthetic and real-world datasets under the proposed framework. We have observed that a high online click-through rate does not necessarily mean a better understanding of user initial preference, but ends in prompting users to choose more documents they initially did not favor. Moreover, we find that the training data have notable impacts on the manipulation degrees, and algorithms with more powerful modeling abilities are more sensitive to such impacts. The experiments also verified the usefulness of the proposed metrics for measuring the degree of manipulations. We advocate that future recommendation algorithm studies should be treated as an optimization problem with constrained user preference manipulations.</p>\",\"PeriodicalId\":50936,\"journal\":{\"name\":\"ACM Transactions on Information Systems\",\"volume\":\"26 1\",\"pages\":\"\"},\"PeriodicalIF\":5.4000,\"publicationDate\":\"2023-12-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Information Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1145/3637869\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Information Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3637869","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}

引用次数: 0

摘要

推荐系统有望成为帮助人类用户在没有明确查询的情况下自动查找相关信息的助手。随着推荐系统的发展，人们应用了越来越复杂的学习技术，并在用户参与度指标（如点击量和浏览时间）方面取得了更好的性能。然而，衡量性能的提高可能有两个原因：一是对用户偏好有了更好的理解，二是能够更主动地利用人类的有限理性来诱导用户过度消费。接下来的一个自然问题是，当前的推荐算法是否操纵了用户偏好。如果是，我们能否衡量操纵程度？在本文中，我们提出了一个通用框架，用于在板块推荐和顺序推荐两种情况下对推荐算法的操纵程度进行基准测试。该框架包括四个阶段：初始偏好计算、训练数据收集、算法训练和交互，以及指标计算，其中涉及两个建议的指标：操纵分数和偏好偏移。我们根据提出的框架，在合成数据集和真实数据集中对一些具有代表性的推荐算法进行了基准测试。我们发现，在线点击率高并不一定意味着能更好地了解用户的初始偏好，而是会促使用户选择更多他们最初并不喜欢的文档。此外，我们还发现训练数据对操作度有显著影响，而建模能力更强的算法对这种影响更为敏感。实验还验证了所提出的衡量操纵程度的指标的实用性。我们主张，未来的推荐算法研究应将用户偏好操作作为一个优化问题来处理。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Understanding or Manipulation: Rethinking Online Performance Gains of Modern Recommender Systems

Recommender systems are expected to be assistants that help human users find relevant information automatically without explicit queries. As recommender systems evolve, increasingly sophisticated learning techniques are applied and have achieved better performance in terms of user engagement metrics such as clicks and browsing time. The increase in the measured performance, however, can have two possible attributions: a better understanding of user preferences, and a more proactive ability to utilize human bounded rationality to seduce user over-consumption. A natural following question is whether current recommendation algorithms are manipulating user preferences. If so, can we measure the manipulation level? In this paper, we present a general framework for benchmarking the degree of manipulations of recommendation algorithms, in both slate recommendation and sequential recommendation scenarios. The framework consists of four stages, initial preference calculation, training data collection, algorithm training and interaction, and metrics calculation that involves two proposed metrics, Manipulation Score and Preference Shift. We benchmark some representative recommendation algorithms in both synthetic and real-world datasets under the proposed framework. We have observed that a high online click-through rate does not necessarily mean a better understanding of user initial preference, but ends in prompting users to choose more documents they initially did not favor. Moreover, we find that the training data have notable impacts on the manipulation degrees, and algorithms with more powerful modeling abilities are more sensitive to such impacts. The experiments also verified the usefulness of the proposed metrics for measuring the degree of manipulations. We advocate that future recommendation algorithm studies should be treated as an optimization problem with constrained user preference manipulations.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

ACM Transactions on Information Systems 工程技术-计算机：信息系统

CiteScore

9.40

自引率

14.30%

发文量

165

审稿时长

>12 weeks

期刊介绍： The ACM Transactions on Information Systems (TOIS) publishes papers on information retrieval (such as search engines, recommender systems) that contain: new principled information retrieval models or algorithms with sound empirical validation; observational, experimental and/or theoretical studies yielding new insights into information retrieval or information seeking; accounts of applications of existing information retrieval techniques that shed light on the strengths and weaknesses of the techniques; formalization of new information retrieval or information seeking tasks and of methods for evaluating the performance on those tasks; development of content (text, image, speech, video, etc) analysis methods to support information retrieval and information seeking; development of computational models of user information preferences and interaction behaviors; creation and analysis of evaluation methodologies for information retrieval and information seeking; or surveys of existing work that propose a significant synthesis. The information retrieval scope of ACM Transactions on Information Systems (TOIS) appeals to industry practitioners for its wealth of creative ideas, and to academic researchers for its descriptions of their colleagues'' work.