MIXRTs: Toward Interpretable Multi-Agent Reinforcement Learning via Mixing Recurrent Soft Decision Trees

IEEE Transactions on Pattern Analysis and Machine Intelligence; Pub Date: 2025-02-11; DOI: 10.1109/TPAMI.2025.3540467

Zichuan Liu; Yuanyang Zhu; Zhi Wang; Yang Gao; Chunlin Chen

Volume 47, Issue 5, pp. 4090-4107. Full text: https://ieeexplore.ieee.org/document/10879295/


Abstract

While achieving tremendous success in various fields, existing multi-agent reinforcement learning (MARL) with a black-box neural network makes decisions in an opaque manner that hinders humans from understanding the learned knowledge and how input observations influence decisions. In contrast, existing interpretable approaches usually suffer from weak expressivity and low performance. To bridge this gap, we propose MIXing Recurrent soft decision Trees (MIXRTs), a novel interpretable architecture that can represent explicit decision processes via the root-to-leaf path and reflect each agent’s contribution to the team. Specifically, we construct a novel soft decision tree using a recurrent structure and demonstrate which features influence the decision-making process. Then, based on the value decomposition framework, we linearly assign credit to each agent by explicitly mixing individual action values to estimate the joint action value using only local observations, providing new insights into interpreting the cooperation mechanism. Theoretical analysis confirms that MIXRTs guarantee additivity and monotonicity in the factorization of joint action values. Evaluations on complex tasks like Spread and StarCraft II demonstrate that MIXRTs compete with existing methods while providing clear explanations, paving the way for interpretable and high-performing MARL systems.
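The two mechanisms named in the abstract, routing an observation through a soft decision tree and linearly mixing per-agent action values, can be illustrated with a small sketch. This is not the authors' implementation: it omits the recurrent structure entirely, and the function names (`soft_tree_q_values`, `mix_q_values`), the breadth-first node layout, and the softmax used to keep mixing weights non-negative are all assumptions made for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def soft_tree_q_values(obs, inner_weights, inner_biases, leaf_values):
    """Forward pass of a (non-recurrent) soft decision tree.

    Hypothetical layout: inner nodes are stored breadth-first, so the
    children of inner node i are nodes 2*i + 1 and 2*i + 2, and the leaves
    follow the inner nodes. Each leaf holds one value per action.
    """
    n_inner = len(inner_weights)
    n_leaves = len(leaf_values)
    assert n_leaves == n_inner + 1, "expects a complete binary tree"

    # reach[k] is the probability that the root-to-leaf path passes node k.
    reach = [0.0] * (n_inner + n_leaves)
    reach[0] = 1.0
    for i in range(n_inner):
        logit = sum(w * x for w, x in zip(inner_weights[i], obs)) + inner_biases[i]
        p_left = sigmoid(logit)  # soft routing decision at node i
        reach[2 * i + 1] = reach[i] * p_left
        reach[2 * i + 2] = reach[i] * (1.0 - p_left)

    leaf_reach = reach[n_inner:]
    n_actions = len(leaf_values[0])
    # Action values are the path-probability-weighted average of leaf values.
    q_values = [
        sum(p * leaf[a] for p, leaf in zip(leaf_reach, leaf_values))
        for a in range(n_actions)
    ]
    return q_values, leaf_reach


def mix_q_values(agent_qs, raw_weights):
    """Linear mixing Q_tot = sum_i w_i * Q_i with w_i >= 0.

    Assumption: a softmax over raw_weights makes every weight positive,
    so Q_tot cannot decrease when any single Q_i increases.
    """
    peak = max(raw_weights)
    exps = [math.exp(r - peak) for r in raw_weights]
    total = sum(exps)
    weights = [e / total for e in exps]
    q_tot = sum(w * q for w, q in zip(weights, agent_qs))
    return q_tot, weights


if __name__ == "__main__":
    # Depth-2 tree: 3 inner nodes, 4 leaves, 2 features, 2 actions.
    obs = [0.5, -1.0]
    inner_w = [[1.0, 0.0], [0.0, 1.0], [-1.0, 1.0]]
    inner_b = [0.0, 0.0, 0.0]
    leaves = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
    q, leaf_probs = soft_tree_q_values(obs, inner_w, inner_b, leaves)
    q_tot, weights = mix_q_values([max(q), 2.0], [0.3, -0.2])
    print("per-action values:", q)
    print("leaf path probabilities:", leaf_probs)
    print("mixing weights:", weights, "joint value:", q_tot)
```

In this sketch the inner-node weights show which features drive each routing decision, and the mixing weights give each agent's share of the joint value, which is the kind of readout the abstract describes as interpretable.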


