LLM-as-a-Judge & Reward Model: What They Can and Cannot Do

Guijin Son, Hyunwoo Ko, Hoyoung Lee, Yewon Kim, Seunghyeok Hong
{"title":"LLM-as-a-Judge & Reward Model: What They Can and Cannot Do","authors":"Guijin Son, Hyunwoo Ko, Hoyoung Lee, Yewon Kim, Seunghyeok Hong","doi":"arxiv-2409.11239","DOIUrl":null,"url":null,"abstract":"LLM-as-a-Judge and reward models are widely used alternatives of\nmultiple-choice questions or human annotators for large language model (LLM)\nevaluation. Their efficacy shines in evaluating long-form responses, serving a\ncritical role as evaluators of leaderboards and as proxies to align LLMs via\nreinforcement learning. However, despite their popularity, their effectiveness\noutside of English remains largely unexplored. In this paper, we conduct a\ncomprehensive analysis on automated evaluators, reporting key findings on their\nbehavior in a non-English environment. First, we discover that English\nevaluation capabilities significantly influence language-specific capabilities,\noften more than the language proficiency itself, enabling evaluators trained in\nEnglish to easily transfer their skills to other languages. Second, we identify\ncritical shortcomings, where LLMs fail to detect and penalize errors, such as\nfactual inaccuracies, cultural misrepresentations, and the presence of unwanted\nlanguage. Finally, we release Kudge, the first non-English meta-evaluation\ndataset containing 5,012 human annotations in Korean.","PeriodicalId":501030,"journal":{"name":"arXiv - CS - Computation and Language","volume":"50 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Computation and Language","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.11239","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

LLM-as-a-Judge and reward models are widely used alternatives to multiple-choice questions or human annotators for large language model (LLM) evaluation. They excel at evaluating long-form responses, serving a critical role as leaderboard evaluators and as proxies for aligning LLMs via reinforcement learning. However, despite their popularity, their effectiveness outside of English remains largely unexplored. In this paper, we conduct a comprehensive analysis of automated evaluators, reporting key findings on their behavior in a non-English environment. First, we discover that English evaluation capabilities significantly influence language-specific capabilities, often more than proficiency in the target language itself, enabling evaluators trained in English to readily transfer their skills to other languages. Second, we identify critical shortcomings in which LLMs fail to detect and penalize errors such as factual inaccuracies, cultural misrepresentations, and the presence of unwanted language. Finally, we release Kudge, the first non-English meta-evaluation dataset, containing 5,012 human annotations in Korean.
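The sketch below illustrates, in broad strokes, the two ideas the abstract refers to: pointwise LLM-as-a-Judge scoring of a long-form response, and meta-evaluation of the judge against human preference annotations. It is not the paper's protocol or the Kudge schema; `JUDGE_PROMPT`, `judge_score`, `meta_evaluate`, and the `call_llm` backend are illustrative names introduced here for clarity.

```python
# Minimal, hypothetical sketch of pointwise LLM-as-a-Judge scoring and
# meta-evaluation against human-annotated preference pairs. The actual
# prompts, scale, and annotation format used in the paper may differ.

import re
from typing import Callable, Dict, List

JUDGE_PROMPT = """You are an impartial evaluator. Rate the response to the
instruction on a 1-5 scale for helpfulness and factual accuracy, then output
the score on the last line as "Score: <n>".

Instruction:
{instruction}

Response:
{response}"""


def judge_score(call_llm: Callable[[str], str], instruction: str, response: str) -> int:
    """Ask the judge model for a 1-5 score and parse it from the completion."""
    completion = call_llm(JUDGE_PROMPT.format(instruction=instruction, response=response))
    match = re.search(r"Score:\s*([1-5])", completion)
    return int(match.group(1)) if match else 0  # 0 marks an unparseable verdict


def meta_evaluate(call_llm: Callable[[str], str], annotated: List[Dict[str, str]]) -> float:
    """Agreement rate between judge verdicts and human annotations.

    Each item carries: instruction, chosen (human-preferred response), and
    rejected (human-dispreferred response). The judge "agrees" with the
    human label when it scores the chosen response strictly higher.
    """
    agree = 0
    for item in annotated:
        s_chosen = judge_score(call_llm, item["instruction"], item["chosen"])
        s_rejected = judge_score(call_llm, item["instruction"], item["rejected"])
        agree += s_chosen > s_rejected
    return agree / len(annotated)
```

Keeping `call_llm` as a plain prompt-to-completion callable leaves the sketch backend-agnostic: any chat-completion wrapper can be plugged in, and the agreement rate returned by `meta_evaluate` is the kind of quantity a meta-evaluation dataset such as Kudge is built to measure.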