Robustness of large language models in moral judgements.

IF 2.9 3区 综合性期刊 Q1 MULTIDISCIPLINARY SCIENCES
Royal Society Open Science Pub Date : 2025-04-23 eCollection Date: 2025-04-01 DOI:10.1098/rsos.241229
Soyoung Oh, Vera Demberg
{"title":"Robustness of large language models in moral judgements.","authors":"Soyoung Oh, Vera Demberg","doi":"10.1098/rsos.241229","DOIUrl":null,"url":null,"abstract":"<p><p>With the advent of large language models (LLMs), there has been a growing interest in analysing the preferences encoded in LLMs in the context of morality. Recent work has tested LLMs on various moral judgement tasks and drawn conclusions regarding the alignment between LLMs and humans. The present contribution critically assesses the validity of the method and results employed in previous work for eliciting moral judgements from LLMs. We find that previous results are confounded by biases in the presentation of the options in moral judgement tasks and that LLM responses are highly sensitive to prompt formulation variants as simple as changing 'Case 1' and 'Case 2' to '(A)' and '(B)'. Our results hence indicate that previous conclusions on moral judgements of LLMs cannot be upheld. We make recommendations for more sound methodological setups for future studies.</p>","PeriodicalId":21525,"journal":{"name":"Royal Society Open Science","volume":"12 4","pages":"241229"},"PeriodicalIF":2.9000,"publicationDate":"2025-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12015570/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Royal Society Open Science","FirstCategoryId":"103","ListUrlMain":"https://doi.org/10.1098/rsos.241229","RegionNum":3,"RegionCategory":"综合性期刊","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/4/1 0:00:00","PubModel":"eCollection","JCR":"Q1","JCRName":"MULTIDISCIPLINARY SCIENCES","Score":null,"Total":0}
引用次数: 0

Abstract

With the advent of large language models (LLMs), there has been a growing interest in analysing the preferences encoded in LLMs in the context of morality. Recent work has tested LLMs on various moral judgement tasks and drawn conclusions regarding the alignment between LLMs and humans. The present contribution critically assesses the validity of the method and results employed in previous work for eliciting moral judgements from LLMs. We find that previous results are confounded by biases in the presentation of the options in moral judgement tasks and that LLM responses are highly sensitive to prompt formulation variants as simple as changing 'Case 1' and 'Case 2' to '(A)' and '(B)'. Our results hence indicate that previous conclusions on moral judgements of LLMs cannot be upheld. We make recommendations for more sound methodological setups for future studies.

大型语言模型在道德判断中的鲁棒性。
随着大型语言模型(llm)的出现,人们对在道德背景下分析llm中编码的偏好越来越感兴趣。最近的工作在各种道德判断任务中测试了法学硕士,并得出了关于法学硕士与人类之间一致性的结论。目前的贡献批判性地评估了以前的工作中用于从法学硕士中引出道德判断的方法和结果的有效性。我们发现之前的结果被道德判断任务中选项呈现的偏差所混淆,法学硕士的反应对提示的表述变体非常敏感,比如将“情况1”和“情况2”更改为“(A)”和“(B)”。因此,我们的研究结果表明,之前关于法学硕士道德判断的结论是不成立的。我们为未来的研究提出了更合理的方法设置建议。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Royal Society Open Science
Royal Society Open Science Multidisciplinary-Multidisciplinary
CiteScore
6.00
自引率
0.00%
发文量
508
审稿时长
14 weeks
期刊介绍: Royal Society Open Science is a new open journal publishing high-quality original research across the entire range of science on the basis of objective peer-review. The journal covers the entire range of science and mathematics and will allow the Society to publish all the high-quality work it receives without the usual restrictions on scope, length or impact.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信