User Simulator Assisted Open-ended Conversational Recommendation System

Proceedings of the 5th Workshop on NLP for Conversational AI (NLP4ConvAI 2023) DOI: 10.18653/v1/2023.nlp4convai-1.8

Qiusi Zhan, Xiaojie Guo, Heng Ji, Lingfei Wu


Abstract

Conversational recommendation systems (CRS) have gained popularity in e-commerce as they can recommend items during user interactions. However, current open-ended CRS have limited recommendation performance due to their short-sighted training process, which only predicts one utterance at a time without considering its future impact. To address this, we propose a User Simulator (US) that communicates with the CRS using natural language based on given user preferences, enabling long-term reinforcement learning. We also introduce a framework that uses reinforcement learning (RL) with two novel rewards, i.e., recommendation and conversation rewards, to train the CRS. This approach considers the long-term goals and improves both the conversation and recommendation performance of the CRS. Our experiments show that our proposed framework improves the recall of recommendations by almost 100%. Moreover, human evaluation demonstrates the superiority of our framework in enhancing the informativeness of generated utterances.
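To illustrate what "considering future impact" can mean in practice, the following is a minimal sketch of how two per-turn rewards might be combined and propagated backwards over a simulated dialogue as a discounted return. The abstract does not define the rewards or how they are combined, so everything here is an assumption made for illustration: the `Turn` class, the weighted sum, the `rec_weight` and `gamma` parameters, and the example scores are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Turn:
    """One simulated CRS turn with two hypothetical reward signals."""
    recommendation_reward: float  # assumed: 1.0 if the recommended item fits the simulated user's preferences
    conversation_reward: float    # assumed: a quality score for the generated utterance, in [0, 1]


def turn_reward(turn: Turn, rec_weight: float = 0.5) -> float:
    """Combine the two rewards with a weighted sum (the weighting is an assumption)."""
    return rec_weight * turn.recommendation_reward + (1.0 - rec_weight) * turn.conversation_reward


def discounted_returns(turns: List[Turn], gamma: float = 0.9, rec_weight: float = 0.5) -> List[float]:
    """Compute G_t = r_t + gamma * G_{t+1} for every turn, iterating from the last turn back."""
    returns = [0.0] * len(turns)
    running = 0.0
    for t in range(len(turns) - 1, -1, -1):
        running = turn_reward(turns[t], rec_weight) + gamma * running
        returns[t] = running
    return returns


# A three-turn simulated dialogue in which only the final turn makes a successful recommendation.
dialogue = [Turn(0.0, 0.5), Turn(0.0, 0.5), Turn(1.0, 0.5)]
print(discounted_returns(dialogue, gamma=0.5))  # [0.5625, 0.625, 0.75]
```

In this toy example the first two turns earn credit for the recommendation that succeeds only at the third turn, which is the kind of long-term signal that single-utterance prediction cannot provide.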



