Aligning with ideal values: a proposal for anchoring AI in moral expertise

Erich Riesen, Mark Boespflug
{"title":"Aligning with ideal values: a proposal for anchoring AI in moral expertise","authors":"Erich Riesen,&nbsp;Mark Boespflug","doi":"10.1007/s43681-025-00664-1","DOIUrl":null,"url":null,"abstract":"<div><p>Autonomous AI agents are increasingly required to operate in contexts where human welfare is at stake, raising the imperative for them to act in ways that are morally optimal—or at least morally permissible. The value alignment research program seeks to create “beneficial AI” by aligning AI behavior with human values (Russell in Human compatible: artificial intelligence and the problem of control, Penguin, London, 2019). In this article, we propose a method for specifying permissible outcomes for AI agents that targets ideal values via moral expertise as embodied in the collective judgments of philosophical ethicists. We defend the notion that ethicists are moral experts against several objections found in the recent literature and argue that their aggregated judgments offer the epistemically best available proxy for moral truth. We recommend a systematic study of ethicists’ judgments—using tools from social psychology and social choice theory—to guide AI agents' behavior in morally complex situations.</p></div>","PeriodicalId":72137,"journal":{"name":"AI and ethics","volume":"5 4","pages":"3727 - 3741"},"PeriodicalIF":0.0000,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"AI and ethics","FirstCategoryId":"1085","ListUrlMain":"https://link.springer.com/article/10.1007/s43681-025-00664-1","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Autonomous AI agents are increasingly required to operate in contexts where human welfare is at stake, raising the imperative for them to act in ways that are morally optimal—or at least morally permissible. The value alignment research program seeks to create “beneficial AI” by aligning AI behavior with human values (Russell in Human compatible: artificial intelligence and the problem of control, Penguin, London, 2019). In this article, we propose a method for specifying permissible outcomes for AI agents that targets ideal values via moral expertise as embodied in the collective judgments of philosophical ethicists. We defend the notion that ethicists are moral experts against several objections found in the recent literature and argue that their aggregated judgments offer the epistemically best available proxy for moral truth. We recommend a systematic study of ethicists’ judgments—using tools from social psychology and social choice theory—to guide AI agents' behavior in morally complex situations.
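As a purely illustrative aside (not from the paper), the abstract's idea of aggregating ethicists' judgments with tools from social choice theory can be sketched with a toy example. The snippet below assumes hypothetical per-ethicist permissibility ratings for candidate actions and applies a simple majority rule, one of many aggregation rules such a study might consider.

```python
from collections import Counter

# Hypothetical illustration (not from the paper): each ethicist rates
# candidate actions as "permissible" or "impermissible"; a simple
# majority rule aggregates those judgments into a collective verdict.
judgments = {
    "ethicist_1": {"act_A": "permissible", "act_B": "impermissible"},
    "ethicist_2": {"act_A": "permissible", "act_B": "permissible"},
    "ethicist_3": {"act_A": "impermissible", "act_B": "impermissible"},
}

def aggregate_by_majority(judgments):
    """Return the majority verdict for each action across all ethicists."""
    verdicts = {}
    actions = {a for ratings in judgments.values() for a in ratings}
    for action in sorted(actions):
        votes = Counter(ratings[action] for ratings in judgments.values()
                        if action in ratings)
        verdicts[action] = votes.most_common(1)[0][0]
    return verdicts

print(aggregate_by_majority(judgments))
# {'act_A': 'permissible', 'act_B': 'impermissible'}
```

This is only a minimal sketch of judgment aggregation; the paper itself argues for a systematic empirical study of ethicists' judgments rather than any particular voting rule.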
