{"title":"Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model","authors":"H. Deng, Colin Raffel","doi":"10.18653/v1/2023.emnlp-main.721","DOIUrl":null,"url":null,"abstract":"While large language models have proven effective in a huge range of downstream applications, they often generate text that is problematic or lacks a desired attribute. In this paper, we introduce Reward-Augmented Decoding (RAD), a text generation procedure that uses a small unidirectional reward model to encourage a language model to generate text that has certain properties. Specifically, RAD uses the reward model to score generations as they are produced and rescales sampling probabilities to favor high-reward tokens. By using a unidirectional reward model, RAD can cache activations from prior generation steps to decrease computational overhead. Through experiments on generating non-toxic and sentiment-controlled text, we demonstrate that RAD performs best among methods that change only the generation procedure and matches the performance of state-of-the-art methods that involve re-training the language model. We further validate that RAD is effective on very large language models while incurring a minimal computational overhead.","PeriodicalId":505350,"journal":{"name":"Conference on Empirical Methods in Natural Language Processing","volume":"4 1","pages":"11781-11791"},"PeriodicalIF":0.0000,"publicationDate":"2023-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Conference on Empirical Methods in Natural Language Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.18653/v1/2023.emnlp-main.721","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
While large language models have proven effective in a huge range of downstream applications, they often generate text that is problematic or lacks a desired attribute. In this paper, we introduce Reward-Augmented Decoding (RAD), a text generation procedure that uses a small unidirectional reward model to encourage a language model to generate text that has certain properties. Specifically, RAD uses the reward model to score generations as they are produced and rescales sampling probabilities to favor high-reward tokens. By using a unidirectional reward model, RAD can cache activations from prior generation steps to decrease computational overhead. Through experiments on generating non-toxic and sentiment-controlled text, we demonstrate that RAD performs best among methods that change only the generation procedure and matches the performance of state-of-the-art methods that involve re-training the language model. We further validate that RAD is effective on very large language models while incurring a minimal computational overhead.
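To make the decoding procedure concrete, below is a minimal sketch of a reward-augmented sampling loop. It assumes generic `lm` and `reward_model` callables (a language model returning next-token logits and a reward model returning a scalar score for a partial sequence), and the specific way the reward is folded into the logits (an additive shift weighted by `beta` over the top-k candidates) is an illustrative assumption rather than the paper's exact formulation.

```python
import torch

def rad_decode(lm, reward_model, prompt_ids, max_new_tokens=50, top_k=20, beta=5.0):
    """Sketch of a reward-augmented decoding loop.

    Assumed interfaces (not the paper's API):
      lm(ids)           -> [seq_len, vocab] next-token logits
      reward_model(ids) -> scalar reward for the partial sequence
    """
    ids = prompt_ids.clone()
    for _ in range(max_new_tokens):
        logits = lm(ids)[-1]                          # logits for the next token
        topk_logits, topk_ids = torch.topk(logits, top_k)

        # Score each candidate continuation with the reward model. With a
        # unidirectional reward model, activations for the shared prefix could
        # be cached and reused across candidates and steps.
        rewards = torch.stack([
            reward_model(torch.cat([ids, tok.view(1)]))
            for tok in topk_ids
        ])

        # Rescale sampling probabilities to favor high-reward tokens
        # (additive logit shift weighted by beta; assumed combination rule).
        adjusted = topk_logits + beta * rewards
        probs = torch.softmax(adjusted, dim=-1)
        next_tok = topk_ids[torch.multinomial(probs, 1)]
        ids = torch.cat([ids, next_tok])
    return ids
```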