From Dissipativity Theory to Compositional Construction of Finite Markov Decision Processes

Abolfazl Lavaei, S. Soudjani, Majid Zamani
Proceedings of the 21st International Conference on Hybrid Systems: Computation and Control (part of CPS Week)
Publication date: 2017-12-21
DOI: 10.1145/3178126.3178135 (https://doi.org/10.1145/3178126.3178135)
Citations: 38

Abstract

This paper is concerned with a compositional approach for constructing finite Markov decision processes of interconnected discrete-time stochastic control systems. The proposed approach leverages the interconnection topology and a notion of so-called stochastic storage functions describing joint dissipativity-type properties of subsystems and their abstractions. In the first part of the paper, we derive dissipativity-type compositional conditions for quantifying the error between the interconnection of stochastic control subsystems and that of their abstractions. In the second part of the paper, we propose an approach to construct finite Markov decision processes together with their corresponding stochastic storage functions for classes of discrete-time control systems satisfying an incremental passivability property. Under this property, one can construct finite Markov decision processes by a suitable discretization of the input and state sets. Moreover, we show that for linear stochastic control systems, the aforementioned property can be readily checked by a matrix inequality. We apply our proposed results to temperature regulation in a circular building by compositionally constructing a finite Markov decision process of a network containing 200 rooms, in which the compositionality condition does not require any constraint on the number or gains of the subsystems. We employ the constructed finite Markov decision process as a substitute to synthesize policies regulating the temperature in each room for a bounded time horizon. We also illustrate the effectiveness of our results on an example of a fully connected network.
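The discretization step mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's construction: it assumes a hypothetical scalar linear system x' = a*x + b*u + w with Gaussian noise w ~ N(0, sigma^2), a uniform grid over a bounded state interval, and a single sink state that collects the probability of leaving that interval. The function names and all parameter values are made up for illustration.

```python
import math


def gaussian_cdf(x, mean, std):
    """CDF of the normal distribution N(mean, std^2) evaluated at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def build_finite_mdp(a, b, sigma, x_min, x_max, n_cells, inputs):
    """Abstract x' = a*x + b*u + w, w ~ N(0, sigma^2), on [x_min, x_max].

    Returns (centers, P). P[i][j] is a list of length n_cells + 1 giving the
    probability of moving from cell i under inputs[j] to each cell; the last
    entry is a sink state for the mass that leaves [x_min, x_max].
    """
    width = (x_max - x_min) / n_cells
    edges = [x_min + k * width for k in range(n_cells + 1)]
    # Each cell is represented by its center point (an assumed choice).
    centers = [0.5 * (edges[k] + edges[k + 1]) for k in range(n_cells)]

    P = []
    for xc in centers:
        per_input = []
        for u in inputs:
            mean = a * xc + b * u  # noise-free successor of the representative
            cdf = [gaussian_cdf(e, mean, sigma) for e in edges]
            row = [cdf[k + 1] - cdf[k] for k in range(n_cells)]
            row.append(1.0 - (cdf[-1] - cdf[0]))  # probability of leaving the grid
            per_input.append(row)
        P.append(per_input)
    return centers, P


# Hypothetical parameters: 10 state cells on [-1, 1] and 3 quantized inputs.
centers, P = build_finite_mdp(
    a=0.9, b=0.5, sigma=0.2,
    x_min=-1.0, x_max=1.0, n_cells=10,
    inputs=[-1.0, 0.0, 1.0],
)
```

Each `P[i][j]` is a probability distribution over the 10 cells plus the sink state. A finer grid typically gives a closer abstraction at the cost of a larger model, which is the trade-off that motivates building abstractions subsystem by subsystem rather than for the whole network at once.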