Information Acquisition Driven by Reinforcement in Non-Deterministic Environments

American Journal of Trade and Policy. Pub Date: 2019-12-31. DOI: 10.18034/ajtp.v6i3.569

N. Bynagari, Ruhul Amin

Citations: 11

Abstract

What is the fastest way for an agent living in a non-deterministic Markov environment (NME) to learn that environment's statistical properties? The answer is to construct "optimal" experiment sequences by carrying out action sequences that maximize expected knowledge gain. This idea is put into practice by combining techniques from information theory and reinforcement learning. Experiments demonstrate that the resulting method, reinforcement-driven information acquisition (RDIA), explores particular NMEs substantially faster than standard random exploration. Exploration was studied separately from exploitation, and we compared the performance of several RDIA variants with that of traditional random exploration.
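The idea of rewarding an agent for knowledge gain can be illustrated with a minimal sketch. The abstract does not specify how knowledge gain is measured or which learning rule is used, so everything below is an assumption made for illustration: knowledge gain is taken to be the Kullback-Leibler divergence between the agent's Laplace-smoothed transition estimates after and before an observation, and that gain is fed as the reward into tabular Q-learning with epsilon-greedy action selection. The names `kl_gain` and `explore`, and all parameter values, are hypothetical.

```python
import math
import random

def kl_gain(old_counts, new_counts):
    """KL divergence D(new || old) between Laplace-smoothed estimates
    of p(s' | s, a). Assumed here as the measure of knowledge gain."""
    k = len(old_counts)
    old_total = sum(old_counts) + k
    new_total = sum(new_counts) + k
    gain = 0.0
    for o, n in zip(old_counts, new_counts):
        p_new = (n + 1) / new_total
        p_old = (o + 1) / old_total
        gain += p_new * math.log(p_new / p_old)
    return gain

def explore(transition, n_states, n_actions, steps,
            alpha=0.5, gamma=0.9, eps=0.1, seed=0):
    """Explore an NME given as transition[s][a] = list of p(s' | s, a).
    Returns the transition counts and the learned Q-table."""
    rng = random.Random(seed)
    counts = [[[0] * n_states for _ in range(n_actions)]
              for _ in range(n_states)]
    q = [[0.0] * n_actions for _ in range(n_states)]
    s = 0
    for _ in range(steps):
        # Epsilon-greedy choice over the current Q-values.
        if rng.random() < eps:
            a = rng.randrange(n_actions)
        else:
            a = max(range(n_actions), key=lambda i: q[s][i])
        # The environment samples the next state non-deterministically.
        s_next = rng.choices(range(n_states), weights=transition[s][a])[0]
        # Reward is the change in the agent's model caused by this observation.
        old = list(counts[s][a])
        counts[s][a][s_next] += 1
        reward = kl_gain(old, counts[s][a])
        # Standard Q-learning update with the information-gain reward.
        q[s][a] += alpha * (reward + gamma * max(q[s_next]) - q[s][a])
        s = s_next
    return counts, q
```

Under these assumptions, state-action pairs whose outcome estimates are still changing yield larger rewards, so the agent is drawn toward them; once the estimates for a pair settle, its reward shrinks toward zero and the agent moves on. Calling `explore` with `eps=1.0` gives a purely random explorer on the same environment for comparison.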
