Norm Augmented Reinforcement Learning Agents With Synthesized Normative Rules

IF 0.7 Q4 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of Cases on Information Technology. Pub Date: 2024-07-16. DOI: 10.4018/jcit.345650

Mohd Rashdan Abdul Kadir, Ali Selamat, Ondrej Krejcar


Citations: 0

Abstract

The dynamic deontic (DD) framework is a norm synthesis framework that extracts normative rules from reinforcement learning (RL); however, it was not designed for agent coordination. This study proposes a norm-augmented reinforcement learning framework (NARLF) that extends the DD model with a norm deliberation mechanism, which re-imputes learned norms into RL agents so that their decision-making is norm-biased. The study tests how synthesized norms, applied on-line and off-line, affect agent learning performance. NARLF consists of the DD framework extended with pre-processing and deliberation components that allow normative rules to be re-imputed. A deliberation model, the Norm Augmented Q-Table (NAugQT), is proposed to map normative rules into RL agents via weight updates to Q-values. Results show that the framework can map norms into RL agents and improve their performance, but only when off-line synthesized norms edited to absolute salience values are used. This reveals a limitation when norms with unstable salience are applied. Improvements in norm extraction and pre-processing are required.
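The abstract does not give the NAugQT update rule, so the following is only a minimal sketch of one plausible reading of "mapping normative rules into RL agents via Q-value weights". Every name here (`augment_q_values`, `select_action`, `q_update`, the `"obligation"`/`"prohibition"` labels, the `weight` parameter, and the norm dictionary layout) is a hypothetical choice for illustration, not the authors' implementation. The sketch assumes a tabular Q-learner whose greedy choice is biased by norm salience, while the stored Q-values are still updated by the standard Q-learning rule.

```python
import random

# Assumed norm layout: (state, action) -> (modality, salience in [0, 1]).
# "obligation" raises and "prohibition" lowers an action's value (hypothetical).

def augment_q_values(q_row, state, norms, weight=1.0):
    """Return a copy of one Q-table row, shifted by the norms that apply to `state`."""
    biased = dict(q_row)
    for action in biased:
        norm = norms.get((state, action))
        if norm is None:
            continue  # no norm covers this state-action pair
        modality, salience = norm
        sign = 1.0 if modality == "obligation" else -1.0
        biased[action] += sign * weight * salience
    return biased

def select_action(q_row, state, norms, epsilon, rng):
    """Epsilon-greedy selection over the norm-biased Q-values."""
    if rng.random() < epsilon:
        return rng.choice(sorted(q_row))
    biased = augment_q_values(q_row, state, norms)
    return max(sorted(biased), key=biased.get)

def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update on the unbiased table."""
    best_next = max(q[next_state].values())
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])

# Tiny usage example with three states and two actions.
actions = ["left", "right"]
q = {s: {a: 0.0 for a in actions} for s in range(3)}
norms = {(0, "left"): ("prohibition", 0.8)}  # illustrative norm with fixed salience
rng = random.Random(0)

chosen = select_action(q[0], 0, norms, epsilon=0.0, rng=rng)
q_update(q, 0, chosen, reward=1.0, next_state=1)
```

In this sketch the salience value is held fixed, loosely mirroring the abstract's observation that stable (absolute) salience values worked while unstable ones did not. A salience that changes during learning would make the bias term drift between steps.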


Source journal

Journal of Cases on Information Technology (COMPUTER SCIENCE, INFORMATION SYSTEMS)

CiteScore

2.60

Self-citation rate

0.00%

Journal description: JCIT documents comprehensive, real-life cases based on individual, organizational, and societal experiences related to the utilization and management of information technology. Cases published in JCIT deal with a wide variety of organizations, such as businesses, government organizations, educational institutions, libraries, and non-profit organizations. Additionally, cases published in JCIT report not only successful utilization of IT applications, but also failures and mismanagement of IT resources and applications.