Formal Ethical Obligations in Reinforcement Learning Agents: Verification and Policy Updates

Colin Shea-Blymyer, Houssam Abbas
{"title":"Formal Ethical Obligations in Reinforcement Learning Agents: Verification and Policy Updates","authors":"Colin Shea-Blymyer, Houssam Abbas","doi":"arxiv-2408.00147","DOIUrl":null,"url":null,"abstract":"When designing agents for operation in uncertain environments, designers need\ntools to automatically reason about what agents ought to do, how that conflicts\nwith what is actually happening, and how a policy might be modified to remove\nthe conflict. These obligations include ethical and social obligations,\npermissions and prohibitions, which constrain how the agent achieves its\nmission and executes its policy. We propose a new deontic logic, Expected Act\nUtilitarian deontic logic, for enabling this reasoning at design time: for\nspecifying and verifying the agent's strategic obligations, then modifying its\npolicy from a reference policy to meet those obligations. Unlike approaches\nthat work at the reward level, working at the logical level increases the\ntransparency of the trade-offs. We introduce two algorithms: one for\nmodel-checking whether an RL agent has the right strategic obligations, and one\nfor modifying a reference decision policy to make it meet obligations expressed\nin our logic. We illustrate our algorithms on DAC-MDPs which accurately\nabstract neural decision policies, and on toy gridworld environments.","PeriodicalId":501208,"journal":{"name":"arXiv - CS - Logic in Computer Science","volume":"75 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Logic in Computer Science","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2408.00147","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

When designing agents for operation in uncertain environments, designers need tools to automatically reason about what agents ought to do, how that conflicts with what is actually happening, and how a policy might be modified to remove the conflict. These obligations include ethical and social obligations, permissions, and prohibitions, which constrain how the agent achieves its mission and executes its policy. We propose a new deontic logic, Expected Act Utilitarian deontic logic, for enabling this reasoning at design time: for specifying and verifying the agent's strategic obligations, then modifying its policy from a reference policy to meet those obligations. Unlike approaches that work at the reward level, working at the logical level increases the transparency of the trade-offs. We introduce two algorithms: one for model-checking whether an RL agent has the right strategic obligations, and one for modifying a reference decision policy to make it meet obligations expressed in our logic. We illustrate our algorithms on DAC-MDPs, which accurately abstract neural decision policies, and on toy gridworld environments.
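The abstract only names the two algorithms, so the Python sketch below is a minimal illustration of the underlying idea on a toy gridworld, under a deliberately simplified act-utilitarian reading in which "the agent ought to take an action" means that the action maximizes expected discounted utility. The gridworld, reward values, and every identifier (step_distribution, check_obligations, repair) are hypothetical; this is not the paper's Expected Act Utilitarian semantics, its model-checking procedure, or its DAC-MDP-based policy update.

import numpy as np

# Hypothetical 1-D gridworld: states 0..4, pit at state 0, goal at state 4.
# The intended move succeeds with probability 0.8, otherwise the agent stays
# put. The model, reward numbers, and all names below are illustrative
# assumptions, not the paper's EAU semantics or DAC-MDP construction.
N_STATES, ACTIONS, GAMMA = 5, ("left", "right"), 0.9

def step_distribution(s, a):
    """Return {next_state: probability} for taking action a in state s."""
    if s in (0, N_STATES - 1):              # pit and goal are absorbing
        return {s: 1.0}
    target = s - 1 if a == "left" else s + 1
    return {target: 0.8, s: 0.2}

def reward(s):
    return {0: -1.0, N_STATES - 1: 1.0}.get(s, 0.0)

def evaluate(policy, iters=200):
    """Iterative policy evaluation: V[s] approximates expected discounted utility."""
    V = np.zeros(N_STATES)
    for _ in range(iters):
        for s in range(N_STATES):
            V[s] = sum(p * (reward(s2) + GAMMA * V[s2])
                       for s2, p in step_distribution(s, policy[s]).items())
    return V

def q_value(s, a, V):
    return sum(p * (reward(s2) + GAMMA * V[s2])
               for s2, p in step_distribution(s, a).items())

def check_obligations(policy):
    """Check, at every non-terminal state, the simplified obligation
    'the agent ought to take an action of maximal expected utility'.
    Returns (state, chosen_action, obligatory_action) triples for violations."""
    V = evaluate(policy)
    violations = []
    for s in range(1, N_STATES - 1):
        best = max(ACTIONS, key=lambda a: q_value(s, a, V))
        if q_value(s, policy[s], V) + 1e-9 < q_value(s, best, V):
            violations.append((s, policy[s], best))
    return violations

def repair(reference_policy):
    """Overwrite only the offending choices of the reference policy until
    no obligation is violated (a crude analogue of a policy update)."""
    policy = dict(reference_policy)
    while True:
        violations = check_obligations(policy)
        if not violations:
            return policy
        s, _, obligatory = violations[0]
        policy[s] = obligatory

reference = {s: "left" for s in range(N_STATES)}    # a deliberately poor policy
print("violations:", check_obligations(reference))
print("repaired policy:", repair(reference))

In this toy reading, obligation checking collapses into a Q-value comparison against the reference policy, and the repair step is a local policy improvement at the violating states. The paper instead works at the level of logical obligation formulas over DAC-MDP abstractions of neural policies, which is what makes the trade-offs transparent rather than buried in the reward.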