Identifying tractable decentralized control problems on the basis of information structure

Aditya Mahajan, A. Nayyar, D. Teneketzis
Published in: 2008 46th Annual Allerton Conference on Communication, Control, and Computing
Publication date: 2008-09-01
DOI: 10.1109/ALLERTON.2008.4797732 (https://doi.org/10.1109/ALLERTON.2008.4797732)
Citations: 46

Abstract

Sequential decomposition of two general models of decentralized systems with non-classical information structures is presented. In model A, all agents have two observations at each step: a common observation that all agents observe and a private observation of their own. The control action of each agent is based on all past common observations, the current private observation, and the contents of its memory. At each step, each agent also updates the contents of its memory. A cost function, which depends on the state of the plant and the control actions of all agents, is given. The objective is to choose control and memory update functions for all agents to minimize either a total expected cost over a finite horizon or a discounted cost over an infinite horizon. In model B, the agents do not have any common observation; the rest is the same as in model A. The key idea of our solution methodology is the following. From the point of view of a fictitious agent that observes all common observations, the system can be viewed as a centralized system with partial observations. This allows us to identify information states and obtain a sequential decomposition. When the system variables take values in finite sets, the optimality equations of the sequential decomposition are similar to those of partially observable Markov decision processes (POMDPs) with finite state and action spaces. For such systems, we can use algorithms for POMDPs to compute optimal designs for models A and B.
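The abstract does not spell out the information states it identifies. As a point of reference only, the sketch below shows the standard belief update for a finite POMDP, which is the textbook information state for a centralized system with partial observations. The function name `belief_update`, the arrays `P` and `O`, and their layouts are illustrative assumptions, not the paper's notation or its construction for models A and B.

```python
import numpy as np


def belief_update(belief, action, observation, P, O):
    """One step of the standard Bayes filter for a finite POMDP.

    belief: shape (S,), current probability distribution over states.
    P: shape (A, S, S), P[a, s, s2] = Pr(next state s2 | state s, action a).
    O: shape (A, S, Z), O[a, s2, z] = Pr(observation z | next state s2, action a).
    (These layouts are assumptions made for this sketch.)
    """
    # Prediction step: push the belief through the transition kernel.
    predicted = belief @ P[action]
    # Correction step: weight each next state by the observation likelihood.
    unnormalized = predicted * O[action][:, observation]
    total = unnormalized.sum()
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return unnormalized / total


# Hypothetical toy model: 2 states, 1 action, 2 observations.
P = np.array([[[0.9, 0.1],
               [0.2, 0.8]]])
O = np.array([[[0.8, 0.2],
               [0.3, 0.7]]])
b0 = np.array([0.5, 0.5])
b1 = belief_update(b0, action=0, observation=0, P=P, O=O)
print(b1)
```

The updated belief is again a probability vector over the finite state set, which is why dynamic programs written over such information states can be handled by POMDP algorithms when all variables are finite.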