激励响应,工具控制和影响

IF 4.6 2区 计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Ryan Carey , Eric Langlois , Chris van Merwijk , Shane Legg , Tom Everitt
{"title":"激励响应,工具控制和影响","authors":"Ryan Carey ,&nbsp;Eric Langlois ,&nbsp;Chris van Merwijk ,&nbsp;Shane Legg ,&nbsp;Tom Everitt","doi":"10.1016/j.artint.2025.104408","DOIUrl":null,"url":null,"abstract":"<div><div>We introduce three concepts that describe an agent's incentives: response incentives indicate which variables in the environment, such as sensitive demographic information, affect the decision under the optimal policy. Instrumental control incentives indicate whether an agent's policy is chosen to manipulate part of its environment, such as the preferences or instructions of a user. Impact incentives indicate which variables an agent will affect, intentionally or otherwise. For each concept, we establish sound and complete graphical criteria, and discuss general classes of techniques that may be used to produce incentives for safe and fair agent behaviour. Finally, we outline how these notions may be generalised to multi-decision settings.</div><div>This journal paper extends our conference publication “Agent Incentives: A Causal Perspective”: the material on response incentives and instrumental control incentives is updated, while the work on impact incentives and multi-decision settings is entirely new.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"348 ","pages":"Article 104408"},"PeriodicalIF":4.6000,"publicationDate":"2025-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Incentives for responsiveness, instrumental control and impact\",\"authors\":\"Ryan Carey ,&nbsp;Eric Langlois ,&nbsp;Chris van Merwijk ,&nbsp;Shane Legg ,&nbsp;Tom Everitt\",\"doi\":\"10.1016/j.artint.2025.104408\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>We introduce three concepts that describe an agent's incentives: response incentives indicate which variables in the environment, such as sensitive demographic information, affect the decision under the optimal policy. Instrumental control incentives indicate whether an agent's policy is chosen to manipulate part of its environment, such as the preferences or instructions of a user. Impact incentives indicate which variables an agent will affect, intentionally or otherwise. For each concept, we establish sound and complete graphical criteria, and discuss general classes of techniques that may be used to produce incentives for safe and fair agent behaviour. Finally, we outline how these notions may be generalised to multi-decision settings.</div><div>This journal paper extends our conference publication “Agent Incentives: A Causal Perspective”: the material on response incentives and instrumental control incentives is updated, while the work on impact incentives and multi-decision settings is entirely new.</div></div>\",\"PeriodicalId\":8434,\"journal\":{\"name\":\"Artificial Intelligence\",\"volume\":\"348 \",\"pages\":\"Article 104408\"},\"PeriodicalIF\":4.6000,\"publicationDate\":\"2025-09-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Artificial Intelligence\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0004370225001274\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Artificial Intelligence","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0004370225001274","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

摘要

我们引入了描述agent激励的三个概念:响应激励表明环境中的哪些变量,如敏感的人口统计信息,会影响最优策略下的决策;工具控制激励指示代理是否选择策略来操纵其环境的一部分,例如用户的偏好或指令。影响激励表明代理人有意或无意地影响哪些变量。对于每个概念,我们建立了健全和完整的图形标准,并讨论了可用于产生安全和公平代理行为激励的一般技术类别。最后,我们概述了如何将这些概念推广到多决策设置。这篇期刊论文扩展了我们的会议出版物“Agent Incentives: A Causal Perspective”:更新了关于响应激励和工具控制激励的材料,而关于影响激励和多决策设置的工作则是全新的。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Incentives for responsiveness, instrumental control and impact
We introduce three concepts that describe an agent's incentives: response incentives indicate which variables in the environment, such as sensitive demographic information, affect the decision under the optimal policy. Instrumental control incentives indicate whether an agent's policy is chosen to manipulate part of its environment, such as the preferences or instructions of a user. Impact incentives indicate which variables an agent will affect, intentionally or otherwise. For each concept, we establish sound and complete graphical criteria, and discuss general classes of techniques that may be used to produce incentives for safe and fair agent behaviour. Finally, we outline how these notions may be generalised to multi-decision settings.
This journal paper extends our conference publication “Agent Incentives: A Causal Perspective”: the material on response incentives and instrumental control incentives is updated, while the work on impact incentives and multi-decision settings is entirely new.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Artificial Intelligence
Artificial Intelligence 工程技术-计算机:人工智能
CiteScore
11.20
自引率
1.40%
发文量
118
审稿时长
8 months
期刊介绍: The Journal of Artificial Intelligence (AIJ) welcomes papers covering a broad spectrum of AI topics, including cognition, automated reasoning, computer vision, machine learning, and more. Papers should demonstrate advancements in AI and propose innovative approaches to AI problems. Additionally, the journal accepts papers describing AI applications, focusing on how new methods enhance performance rather than reiterating conventional approaches. In addition to regular papers, AIJ also accepts Research Notes, Research Field Reviews, Position Papers, Book Reviews, and summary papers on AI challenges and competitions.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信