Formal Ethical Obligations in Reinforcement Learning Agents: Verification and Policy Updates

Colin Shea-Blymyer, Houssam Abbas
{"title":"Formal Ethical Obligations in Reinforcement Learning Agents: Verification and Policy Updates","authors":"Colin Shea-Blymyer, Houssam Abbas","doi":"arxiv-2408.00147","DOIUrl":null,"url":null,"abstract":"When designing agents for operation in uncertain environments, designers need\ntools to automatically reason about what agents ought to do, how that conflicts\nwith what is actually happening, and how a policy might be modified to remove\nthe conflict. These obligations include ethical and social obligations,\npermissions and prohibitions, which constrain how the agent achieves its\nmission and executes its policy. We propose a new deontic logic, Expected Act\nUtilitarian deontic logic, for enabling this reasoning at design time: for\nspecifying and verifying the agent's strategic obligations, then modifying its\npolicy from a reference policy to meet those obligations. Unlike approaches\nthat work at the reward level, working at the logical level increases the\ntransparency of the trade-offs. We introduce two algorithms: one for\nmodel-checking whether an RL agent has the right strategic obligations, and one\nfor modifying a reference decision policy to make it meet obligations expressed\nin our logic. We illustrate our algorithms on DAC-MDPs which accurately\nabstract neural decision policies, and on toy gridworld environments.","PeriodicalId":501208,"journal":{"name":"arXiv - CS - Logic in Computer Science","volume":"75 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Logic in Computer Science","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2408.00147","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

When designing agents for operation in uncertain environments, designers need tools to automatically reason about what agents ought to do, how that conflicts with what is actually happening, and how a policy might be modified to remove the conflict. These obligations include ethical and social obligations, permissions, and prohibitions, which constrain how the agent achieves its mission and executes its policy. We propose a new deontic logic, Expected Act Utilitarian deontic logic, for enabling this reasoning at design time: for specifying and verifying the agent's strategic obligations, then modifying its policy from a reference policy to meet those obligations. Unlike approaches that work at the reward level, working at the logical level increases the transparency of the trade-offs. We introduce two algorithms: one for model-checking whether an RL agent has the right strategic obligations, and one for modifying a reference decision policy to make it meet obligations expressed in our logic. We illustrate our algorithms on DAC-MDPs, which accurately abstract neural decision policies, and on toy gridworld environments.
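The abstract only names the two algorithms, so the Python sketch below is a minimal illustration of the underlying idea on a toy gridworld, under a deliberately simplified act-utilitarian reading in which "the agent ought to take an action" means that the action maximizes expected discounted utility. The gridworld, reward values, and every identifier (step_distribution, check_obligations, repair) are hypothetical; this is not the paper's Expected Act Utilitarian semantics, its model-checking procedure, or its DAC-MDP-based policy update.

import numpy as np

# Hypothetical 1-D gridworld: states 0..4, pit at state 0, goal at state 4.
# The intended move succeeds with probability 0.8, otherwise the agent stays
# put. The model, reward numbers, and all names below are illustrative
# assumptions, not the paper's EAU semantics or DAC-MDP construction.
N_STATES, ACTIONS, GAMMA = 5, ("left", "right"), 0.9

def step_distribution(s, a):
    """Return {next_state: probability} for taking action a in state s."""
    if s in (0, N_STATES - 1):              # pit and goal are absorbing
        return {s: 1.0}
    target = s - 1 if a == "left" else s + 1
    return {target: 0.8, s: 0.2}

def reward(s):
    return {0: -1.0, N_STATES - 1: 1.0}.get(s, 0.0)

def evaluate(policy, iters=200):
    """Iterative policy evaluation: V[s] approximates expected discounted utility."""
    V = np.zeros(N_STATES)
    for _ in range(iters):
        for s in range(N_STATES):
            V[s] = sum(p * (reward(s2) + GAMMA * V[s2])
                       for s2, p in step_distribution(s, policy[s]).items())
    return V

def q_value(s, a, V):
    return sum(p * (reward(s2) + GAMMA * V[s2])
               for s2, p in step_distribution(s, a).items())

def check_obligations(policy):
    """Check, at every non-terminal state, the simplified obligation
    'the agent ought to take an action of maximal expected utility'.
    Returns (state, chosen_action, obligatory_action) triples for violations."""
    V = evaluate(policy)
    violations = []
    for s in range(1, N_STATES - 1):
        best = max(ACTIONS, key=lambda a: q_value(s, a, V))
        if q_value(s, policy[s], V) + 1e-9 < q_value(s, best, V):
            violations.append((s, policy[s], best))
    return violations

def repair(reference_policy):
    """Overwrite only the offending choices of the reference policy until
    no obligation is violated (a crude analogue of a policy update)."""
    policy = dict(reference_policy)
    while True:
        violations = check_obligations(policy)
        if not violations:
            return policy
        s, _, obligatory = violations[0]
        policy[s] = obligatory

reference = {s: "left" for s in range(N_STATES)}    # a deliberately poor policy
print("violations:", check_obligations(reference))
print("repaired policy:", repair(reference))

In this toy reading, obligation checking collapses into a Q-value comparison against the reference policy, and the repair step is a local policy improvement at the violating states. The paper instead works at the level of logical obligation formulas over DAC-MDP abstractions of neural policies, which is what makes the trade-offs transparent rather than buried in the reward.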