Norm Augmented Reinforcement Learning Agents With Synthesized Normative Rules

Pub Date: 2024-07-16 | DOI: 10.4018/jcit.345650
Mohd Rashdan Abdul Kadir, Ali Selamat, Ondrej Krejcar

Abstract

The dynamic deontic (DD) model is a norm synthesis framework that extracts normative rules from reinforcement learning (RL); however, it was not designed for agent coordination. This study proposes a norm augmented reinforcement learning framework (NARLF) that extends the DD model with a norm deliberation mechanism for re-imputing learned norms into norm-biased, decision-making RL agents. The study tests the effect of synthesized norms, applied both online and offline, on agent learning performance. The framework consists of the DD framework extended with pre-processing and deliberation components that allow normative rules to be re-imputed. A deliberation model, the Norm Augmented Q-Table (NAugQT), is proposed to map normative rules into RL agents via weighted q-value updates. Results show that the framework can map norms and improve RL agent performance, but only when offline-synthesized norms with edited absolute salience values are used; this reveals limitations when norms with unstable salience values are applied. Improvements in norm extraction and pre-processing are required.
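The abstract does not give NAugQT's exact update rule, so the following is only a hypothetical sketch of the general idea it describes: a standard tabular Q-learning step whose learned value is then biased by a norm's salience during re-imputation. The function name, the `norms` mapping, and the bias weight `w` are illustrative assumptions, not the paper's method.

```python
def norm_biased_q_update(q, state, action, reward, next_state,
                         actions, norms, alpha=0.1, gamma=0.9, w=0.5):
    """One tabular Q-learning step with a norm-bias term (illustrative).

    q      : dict mapping (state, action) -> q-value
    norms  : dict mapping (state, action) -> salience in [-1, 1];
             positive salience nudges the value up (obligation-like),
             negative nudges it down (prohibition-like).
    """
    # Standard temporal-difference update.
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    key = (state, action)
    td_error = reward + gamma * best_next - q.get(key, 0.0)
    q[key] = q.get(key, 0.0) + alpha * td_error
    # Re-imputation: bias the learned value by the norm's salience.
    if key in norms:
        q[key] += w * norms[key]
    return q[key]
```

With this kind of update, a prohibition (negative salience) pushes the agent's q-value for that state-action pair below what plain Q-learning would produce, which is one plausible way a deliberation component could make learned norms influence action selection.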