Reinforcement Learning Interventions on Boundedly Rational Human Agents in Frictionful Tasks.

Proceedings of the ... International Joint Conference on Autonomous Agents and Multiagent Systems : AAMAS. Pub Date: 2024-05-01. Epub Date: 2024-05-06.

Eura Nofshin, Siddharth Swaroop, Weiwei Pan, Susan Murphy, Finale Doshi-Velez

Volume 2024, pages 1482-1491. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11460771/pdf/


Abstract

Many important behavior changes are frictionful; they require individuals to expend effort over a long period with little immediate gratification. Here, an artificial intelligence (AI) agent can provide personalized interventions to help individuals stick to their goals. In these settings, the AI agent must personalize rapidly (before the individual disengages) and interpretably, to help us understand the behavioral interventions. In this paper, we introduce Behavior Model Reinforcement Learning (BMRL), a framework in which an AI agent intervenes on the parameters of a Markov Decision Process (MDP) belonging to a boundedly rational human agent. Our formulation of the human decision-maker as a planning agent allows us to attribute undesirable human policies (ones that do not lead to the goal) to their maladapted MDP parameters, such as an extremely low discount factor. Furthermore, we propose a class of tractable human models that captures fundamental behaviors in frictionful tasks. Introducing a notion of MDP equivalence specific to BMRL, we theoretically and empirically show that AI planning with our human models can lead to helpful policies on a wide range of more complex, ground-truth humans.
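To make the discount-factor example concrete, the following is a minimal sketch and not the authors' model or code. It assumes a hypothetical chain world in which the human pays a small effort cost for each step toward a goal and can otherwise rest for free; the human plans by value iteration, and the "intervention" simply searches a list of candidate discount factors. All names and numbers (`plan_human_policy`, `smallest_helpful_discount`, `effort_cost`, `goal_reward`) are illustrative assumptions.

```python
def plan_human_policy(gamma, n_states=6, effort_cost=0.3, goal_reward=5.0, sweeps=200):
    """Plan for a hypothetical human on a chain world via value iteration.

    States 0..n_states-1; the last state is an absorbing goal.
    "work" moves one step forward at a cost (the friction); "rest" stays put for free.
    Returns the greedy action for every non-goal state.
    """
    goal = n_states - 1
    values = [0.0] * n_states  # the goal state keeps value 0 once reached

    def q_work(s, v):
        # Pay the effort cost now; collect the goal reward only on the final step.
        reward = -effort_cost + (goal_reward if s + 1 == goal else 0.0)
        return reward + gamma * v[s + 1]

    def q_rest(s, v):
        # No cost, no progress.
        return gamma * v[s]

    for _ in range(sweeps):
        values = [max(q_work(s, values), q_rest(s, values)) for s in range(goal)] + [0.0]

    return ["work" if q_work(s, values) > q_rest(s, values) else "rest"
            for s in range(goal)]


def smallest_helpful_discount(candidates):
    """Toy 'intervention': the smallest candidate discount factor under which
    the simulated human chooses to work from the start state (assumed criterion)."""
    for gamma in sorted(candidates):
        if plan_human_policy(gamma)[0] == "work":
            return gamma
    return None


if __name__ == "__main__":
    print(plan_human_policy(0.3))   # myopic human
    print(plan_human_policy(0.95))  # far-sighted human
    print(smallest_helpful_discount([0.3, 0.5, 0.7, 0.9]))
```

With these made-up numbers, a discount factor of 0.3 leaves the simulated human resting at the start state because the discounted goal reward no longer outweighs the up-front effort, whereas 0.95 makes working the preferred action in every state. The sketch only illustrates how an undesirable policy can be traced to a single MDP parameter; it does not capture the learning or personalization aspects of BMRL.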


