When is it acceptable to break the rules? Knowledge representation of moral judgements based on empirical data

IF 2.6 3区计算机科学 Q3 AUTOMATION & CONTROL SYSTEMS

Autonomous Agents and Multi-Agent Systems Pub Date : 2024-07-13 DOI:10.1007/s10458-024-09667-4

Edmond Awad, Sydney Levine, Andrea Loreggia, Nicholas Mattei, Iyad Rahwan, Francesca Rossi, Kartik Talamadupula, Joshua Tenenbaum, Max Kleiman-Weiner

{"title":"When is it acceptable to break the rules? Knowledge representation of moral judgements based on empirical data","authors":"Edmond Awad, Sydney Levine, Andrea Loreggia, Nicholas Mattei, Iyad Rahwan, Francesca Rossi, Kartik Talamadupula, Joshua Tenenbaum, Max Kleiman-Weiner","doi":"10.1007/s10458-024-09667-4","DOIUrl":null,"url":null,"abstract":"<div><p>Constraining the actions of AI systems is one promising way to ensure that these systems behave in a way that is morally acceptable to humans. But constraints alone come with drawbacks as in many AI systems, they are not flexible. If these constraints are too rigid, they can preclude actions that are actually acceptable in certain, contextual situations. Humans, on the other hand, can often decide when a simple and seemingly inflexible rule should actually be overridden based on the context. In this paper, we empirically investigate the way humans make these contextual moral judgements, with the goal of building AI systems that understand when to follow and when to override constraints. We propose a novel and general preference-based graphical model that captures a modification of standard <i>dual process</i> theories of moral judgment. We then detail the design, implementation, and results of a study of human participants who judge whether it is acceptable to break a well-established rule: <i>no cutting in line</i>. We then develop an instance of our model and compare its performance to that of standard machine learning approaches on the task of predicting the behavior of human participants in the study, showing that our preference-based approach more accurately captures the judgments of human decision-makers. It also provides a flexible method to model the relationship between variables for moral decision-making tasks that can be generalized to other settings.</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"38 2","pages":""},"PeriodicalIF":2.6000,"publicationDate":"2024-07-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10458-024-09667-4.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Autonomous Agents and Multi-Agent Systems","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10458-024-09667-4","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}

引用次数: 0

Abstract

Constraining the actions of AI systems is one promising way to ensure that these systems behave in a way that is morally acceptable to humans. But constraints alone come with drawbacks as in many AI systems, they are not flexible. If these constraints are too rigid, they can preclude actions that are actually acceptable in certain, contextual situations. Humans, on the other hand, can often decide when a simple and seemingly inflexible rule should actually be overridden based on the context. In this paper, we empirically investigate the way humans make these contextual moral judgements, with the goal of building AI systems that understand when to follow and when to override constraints. We propose a novel and general preference-based graphical model that captures a modification of standard dual process theories of moral judgment. We then detail the design, implementation, and results of a study of human participants who judge whether it is acceptable to break a well-established rule: no cutting in line. We then develop an instance of our model and compare its performance to that of standard machine learning approaches on the task of predicting the behavior of human participants in the study, showing that our preference-based approach more accurately captures the judgments of human decision-makers. It also provides a flexible method to model the relationship between variables for moral decision-making tasks that can be generalized to other settings.

Abstract Image

查看原文本刊更多论文

什么时候打破规则是可以接受的？基于经验数据的道德判断知识表征

限制人工智能系统的行为是确保这些系统的行为在道德上为人类所接受的一种可行方法。但是，约束本身也有缺点，因为在许多人工智能系统中，约束并不灵活。如果这些约束过于死板，就会排除在某些特定情况下实际上可以接受的行为。另一方面，人类通常可以根据上下文来决定何时应该推翻看似不灵活的简单规则。在本文中，我们以经验为基础，研究了人类根据上下文做出道德判断的方式，目的是建立能够理解何时遵循、何时推翻约束的人工智能系统。我们提出了一种新颖而通用的基于偏好的图形模型，该模型捕捉了对道德判断标准双重过程理论的修改。然后，我们详细介绍了对人类参与者进行的一项研究的设计、实施和结果，人类参与者判断是否可以接受违反既定规则的行为：不插队。然后，我们开发了一个模型实例，并将其性能与标准机器学习方法在预测研究中人类参与者行为的任务上进行了比较，结果表明我们基于偏好的方法更准确地捕捉到了人类决策者的判断。它还为道德决策任务的变量之间的关系建模提供了一种灵活的方法，可以推广到其他场合。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Autonomous Agents and Multi-Agent Systems 工程技术-计算机：人工智能

CiteScore

6.00

自引率

5.30%

发文量

审稿时长

>12 weeks

期刊介绍： This is the official journal of the International Foundation for Autonomous Agents and Multi-Agent Systems. It provides a leading forum for disseminating significant original research results in the foundations, theory, development, analysis, and applications of autonomous agents and multi-agent systems. Coverage in Autonomous Agents and Multi-Agent Systems includes, but is not limited to: Agent decision-making architectures and their evaluation, including: cognitive models; knowledge representation; logics for agency; ontological reasoning; planning (single and multi-agent); reasoning (single and multi-agent) Cooperation and teamwork, including: distributed problem solving; human-robot/agent interaction; multi-user/multi-virtual-agent interaction; coalition formation; coordination Agent communication languages, including: their semantics, pragmatics, and implementation; agent communication protocols and conversations; agent commitments; speech act theory Ontologies for agent systems, agents and the semantic web, agents and semantic web services, Grid-based systems, and service-oriented computing Agent societies and societal issues, including: artificial social systems; environments, organizations and institutions; ethical and legal issues; privacy, safety and security; trust, reliability and reputation Agent-based system development, including: agent development techniques, tools and environments; agent programming languages; agent specification or validation languages Agent-based simulation, including: emergent behavior; participatory simulation; simulation techniques, tools and environments; social simulation Agreement technologies, including: argumentation; collective decision making; judgment aggregation and belief merging; negotiation; norms Economic paradigms, including: auction and mechanism design; bargaining and negotiation; economically-motivated agents; game theory (cooperative and non-cooperative); social choice and voting Learning agents, including: computational architectures for learning agents; evolution, adaptation; multi-agent learning. Robotic agents, including: integrated perception, cognition, and action; cognitive robotics; robot planning (including action and motion planning); multi-robot systems. Virtual agents, including: agents in games and virtual environments; companion and coaching agents; modeling personality, emotions; multimodal interaction; verbal and non-verbal expressiveness Significant, novel applications of agent technology Comprehensive reviews and authoritative tutorials of research and practice in agent systems Comprehensive and authoritative reviews of books dealing with agents and multi-agent systems.