{"title":"强化学习代理中的正式道德义务:验证和政策更新","authors":"Colin Shea-Blymyer, Houssam Abbas","doi":"arxiv-2408.00147","DOIUrl":null,"url":null,"abstract":"When designing agents for operation in uncertain environments, designers need\ntools to automatically reason about what agents ought to do, how that conflicts\nwith what is actually happening, and how a policy might be modified to remove\nthe conflict. These obligations include ethical and social obligations,\npermissions and prohibitions, which constrain how the agent achieves its\nmission and executes its policy. We propose a new deontic logic, Expected Act\nUtilitarian deontic logic, for enabling this reasoning at design time: for\nspecifying and verifying the agent's strategic obligations, then modifying its\npolicy from a reference policy to meet those obligations. Unlike approaches\nthat work at the reward level, working at the logical level increases the\ntransparency of the trade-offs. We introduce two algorithms: one for\nmodel-checking whether an RL agent has the right strategic obligations, and one\nfor modifying a reference decision policy to make it meet obligations expressed\nin our logic. We illustrate our algorithms on DAC-MDPs which accurately\nabstract neural decision policies, and on toy gridworld environments.","PeriodicalId":501208,"journal":{"name":"arXiv - CS - Logic in Computer Science","volume":"75 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Formal Ethical Obligations in Reinforcement Learning Agents: Verification and Policy Updates\",\"authors\":\"Colin Shea-Blymyer, Houssam Abbas\",\"doi\":\"arxiv-2408.00147\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"When designing agents for operation in uncertain environments, designers need\\ntools to automatically reason about what agents ought to do, how that conflicts\\nwith what is actually happening, and how a policy might be modified to remove\\nthe conflict. These obligations include ethical and social obligations,\\npermissions and prohibitions, which constrain how the agent achieves its\\nmission and executes its policy. We propose a new deontic logic, Expected Act\\nUtilitarian deontic logic, for enabling this reasoning at design time: for\\nspecifying and verifying the agent's strategic obligations, then modifying its\\npolicy from a reference policy to meet those obligations. Unlike approaches\\nthat work at the reward level, working at the logical level increases the\\ntransparency of the trade-offs. We introduce two algorithms: one for\\nmodel-checking whether an RL agent has the right strategic obligations, and one\\nfor modifying a reference decision policy to make it meet obligations expressed\\nin our logic. 
We illustrate our algorithms on DAC-MDPs which accurately\\nabstract neural decision policies, and on toy gridworld environments.\",\"PeriodicalId\":501208,\"journal\":{\"name\":\"arXiv - CS - Logic in Computer Science\",\"volume\":\"75 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-07-31\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Logic in Computer Science\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2408.00147\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Logic in Computer Science","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2408.00147","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Formal Ethical Obligations in Reinforcement Learning Agents: Verification and Policy Updates
When designing agents that operate in uncertain environments, designers need
tools to automatically reason about what agents ought to do, how that conflicts
with what is actually happening, and how a policy might be modified to remove
the conflict. Such requirements include ethical and social obligations,
permissions, and prohibitions, which constrain how the agent achieves its
mission and executes its policy. We propose a new deontic logic, Expected Act
Utilitarian deontic logic, that enables this reasoning at design time: it is
used to specify and verify the agent's strategic obligations, and then to
modify its policy, starting from a reference policy, so that it meets those
obligations. Unlike approaches that work at the reward level, working at the
logical level makes the trade-offs more transparent. We introduce two
algorithms: one for model-checking whether an RL agent has the right strategic
obligations, and one for modifying a reference decision policy so that it meets
the obligations expressed in our logic. We illustrate our algorithms on
DAC-MDPs, which accurately abstract neural decision policies, and on toy
gridworld environments.
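
To convey the flavor of the verification step, here is a minimal, hypothetical
Python sketch. It assumes an act-utilitarian reading in which an action is
obligatory in a state exactly when it maximizes one-step expected utility
there; the paper's actual Expected Act Utilitarian semantics and model-checking
algorithm are not reproduced here, and all names (check_obligations, trans,
etc.) are illustrative only.

    # Hypothetical sketch: flag states where a policy violates an
    # act-utilitarian obligation (its action must maximize expected utility).
    from typing import Dict, List, Tuple

    State, Action = str, str
    # trans[s][a] = list of (probability, next_state, utility) outcomes.
    Transitions = Dict[State, Dict[Action, List[Tuple[float, State, float]]]]

    def expected_utility(trans: Transitions, s: State, a: Action) -> float:
        """One-step expected utility of taking action a in state s."""
        return sum(p * u for p, _, u in trans[s][a])

    def obligated_actions(trans: Transitions, s: State) -> List[Action]:
        """Actions that maximize expected utility in s (read: obligatory)."""
        best = max(expected_utility(trans, s, a) for a in trans[s])
        return [a for a in trans[s] if expected_utility(trans, s, a) == best]

    def check_obligations(trans: Transitions,
                          policy: Dict[State, Action]) -> List[State]:
        """Return the states where the policy's action is not obligated."""
        return [s for s in trans
                if policy[s] not in obligated_actions(trans, s)]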
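
In the same hypothetical setting, a reference policy can be repaired by
replacing each violating action with an obligated one. The paper's second
algorithm presumably makes this choice more carefully (e.g., minimizing
deviation from the reference policy); this sketch uses a naive tie-break.

    def repair_policy(trans: Transitions,
                      policy: Dict[State, Action]) -> Dict[State, Action]:
        """Copy of `policy` meeting the obligation in every state, keeping
        the reference action wherever it is already obligated."""
        repaired = dict(policy)
        for s in check_obligations(trans, policy):
            # Hypothetical tie-break: take the first obligated action.
            repaired[s] = obligated_actions(trans, s)[0]
        return repaired

    # Toy gridworld-flavored example: in state "s0", "safe" (expected
    # utility 1.0) dominates "risky" (expected utility 0.0).
    trans: Transitions = {
        "s0": {"safe":  [(1.0, "s1", 1.0)],
               "risky": [(0.5, "s1", 2.0), (0.5, "s2", -2.0)]},
    }
    ref_policy = {"s0": "risky"}
    assert check_obligations(trans, ref_policy) == ["s0"]
    assert repair_policy(trans, ref_policy) == {"s0": "safe"}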