Controller synthesis for linear temporal logic and steady-state specifications

IF 2.0 | Zone 3 (Computer Science) | Q3 AUTOMATION & CONTROL SYSTEMS
Alvaro Velasquez, Ismail Alkhouri, Andre Beckus, Ashutosh Trivedi, George Atia
{"title":"线性时序逻辑和稳态规范的控制器合成","authors":"Alvaro Velasquez,&nbsp;Ismail Alkhouri,&nbsp;Andre Beckus,&nbsp;Ashutosh Trivedi,&nbsp;George Atia","doi":"10.1007/s10458-024-09648-7","DOIUrl":null,"url":null,"abstract":"<div><p>The problem of deriving decision-making policies, subject to some formal specification of behavior, has been well-studied in the control synthesis, reinforcement learning, and planning communities. Such problems are typically framed in the context of a non-deterministic decision process, the non-determinism of which is optimally resolved by the computed policy. In this paper, we explore the derivation of such policies in Markov decision processes (MDPs) subject to two types of formal specifications. First, we consider steady-state specifications that reason about the infinite-frequency behavior of the resulting agent. This behavior corresponds to the frequency with which an agent visits each state as it follows its decision-making policy indefinitely. Second, we examine the infinite-trace behavior of the agent by imposing Linear Temporal Logic (LTL) constraints on the behavior induced by the resulting policy. We present an algorithm to find a deterministic policy satisfying LTL and steady-state constraints by characterizing the solutions as an integer linear program (ILP) and experimentally evaluate our approach. In our experimental results section, we evaluate the proposed ILP using MDPs with stochastic and deterministic transitions.\n</p></div>","PeriodicalId":55586,"journal":{"name":"Autonomous Agents and Multi-Agent Systems","volume":"38 1","pages":""},"PeriodicalIF":2.0000,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Controller synthesis for linear temporal logic and steady-state specifications\",\"authors\":\"Alvaro Velasquez,&nbsp;Ismail Alkhouri,&nbsp;Andre Beckus,&nbsp;Ashutosh Trivedi,&nbsp;George Atia\",\"doi\":\"10.1007/s10458-024-09648-7\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>The problem of deriving decision-making policies, subject to some formal specification of behavior, has been well-studied in the control synthesis, reinforcement learning, and planning communities. Such problems are typically framed in the context of a non-deterministic decision process, the non-determinism of which is optimally resolved by the computed policy. In this paper, we explore the derivation of such policies in Markov decision processes (MDPs) subject to two types of formal specifications. First, we consider steady-state specifications that reason about the infinite-frequency behavior of the resulting agent. This behavior corresponds to the frequency with which an agent visits each state as it follows its decision-making policy indefinitely. Second, we examine the infinite-trace behavior of the agent by imposing Linear Temporal Logic (LTL) constraints on the behavior induced by the resulting policy. We present an algorithm to find a deterministic policy satisfying LTL and steady-state constraints by characterizing the solutions as an integer linear program (ILP) and experimentally evaluate our approach. 
In our experimental results section, we evaluate the proposed ILP using MDPs with stochastic and deterministic transitions.\\n</p></div>\",\"PeriodicalId\":55586,\"journal\":{\"name\":\"Autonomous Agents and Multi-Agent Systems\",\"volume\":\"38 1\",\"pages\":\"\"},\"PeriodicalIF\":2.0000,\"publicationDate\":\"2024-05-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Autonomous Agents and Multi-Agent Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://link.springer.com/article/10.1007/s10458-024-09648-7\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"AUTOMATION & CONTROL SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Autonomous Agents and Multi-Agent Systems","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10458-024-09648-7","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
Citations: 0

Abstract

The problem of deriving decision-making policies, subject to some formal specification of behavior, has been well-studied in the control synthesis, reinforcement learning, and planning communities. Such problems are typically framed in the context of a non-deterministic decision process, the non-determinism of which is optimally resolved by the computed policy. In this paper, we explore the derivation of such policies in Markov decision processes (MDPs) subject to two types of formal specifications. First, we consider steady-state specifications that reason about the infinite-frequency behavior of the resulting agent. This behavior corresponds to the frequency with which an agent visits each state as it follows its decision-making policy indefinitely. Second, we examine the infinite-trace behavior of the agent by imposing Linear Temporal Logic (LTL) constraints on the behavior induced by the resulting policy. We present an algorithm to find a deterministic policy satisfying LTL and steady-state constraints by characterizing the solutions as an integer linear program (ILP) and experimentally evaluate our approach. In our experimental results section, we evaluate the proposed ILP using MDPs with stochastic and deterministic transitions.
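
The core idea sketched in the abstract, encoding long-run state-visitation frequencies (the steady-state behavior) and policy determinism as an integer linear program over an MDP, can be illustrated with a small example. The following is a minimal sketch only: the two-state MDP, the frequency bounds on s1, and the use of the PuLP/CBC solver are illustrative assumptions, and the paper's actual ILP additionally encodes the LTL constraints (typically via a product with an automaton for the formula), which is omitted here.

```python
# Minimal sketch (assumptions: toy 2-state MDP, illustrative frequency bounds,
# PuLP/CBC as the ILP solver). It encodes only the steady-state and determinism
# parts of the synthesis problem; LTL constraints are not modeled here.
import pulp

states = ["s0", "s1"]
actions = ["a0", "a1"]
# Transition probabilities P[s][a][s'] of the toy MDP.
P = {
    "s0": {"a0": {"s0": 0.9, "s1": 0.1}, "a1": {"s0": 0.2, "s1": 0.8}},
    "s1": {"a0": {"s0": 0.5, "s1": 0.5}, "a1": {"s0": 0.1, "s1": 0.9}},
}
# Steady-state specification: long-run frequency of s1 must lie in [0.4, 0.7].
freq_bounds = {"s1": (0.4, 0.7)}

prob = pulp.LpProblem("steady_state_synthesis", pulp.LpMaximize)

# x[s][a]: long-run frequency of being in s and taking a (occupation measure).
x = pulp.LpVariable.dicts("x", (states, actions), lowBound=0)
# d[s][a]: binary indicator that the deterministic policy chooses a in s.
d = pulp.LpVariable.dicts("d", (states, actions), cat="Binary")

# Objective (arbitrary here): maximize time spent in s1 within its bounds.
prob += pulp.lpSum(x["s1"][a] for a in actions)

# The occupation measure is a probability distribution over state-action pairs.
prob += pulp.lpSum(x[s][a] for s in states for a in actions) == 1

# Stationarity: long-run inflow equals outflow for every state.
for t in states:
    prob += pulp.lpSum(x[t][a] for a in actions) == pulp.lpSum(
        P[s][a].get(t, 0.0) * x[s][a] for s in states for a in actions
    )

# Steady-state visitation bounds from the specification.
for s, (lo, hi) in freq_bounds.items():
    prob += pulp.lpSum(x[s][a] for a in actions) >= lo
    prob += pulp.lpSum(x[s][a] for a in actions) <= hi

# Determinism: exactly one action per state, and x may be positive only on
# the chosen action (x[s][a] <= 1, so d[s][a] is a valid upper bound).
for s in states:
    prob += pulp.lpSum(d[s][a] for a in actions) == 1
    for a in actions:
        prob += x[s][a] <= d[s][a]

prob.solve(pulp.PULP_CBC_CMD(msg=False))
policy = {s: max(actions, key=lambda a: d[s][a].value()) for s in states}
print(pulp.LpStatus[prob.status], policy)
```

Solving the program and reading off the binary variables d yields one action per state; states with zero long-run frequency are left unconstrained by x, so any action may be reported for them in this simplified sketch.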

Source journal: Autonomous Agents and Multi-Agent Systems (Engineering & Technology - Computer Science: Artificial Intelligence)
CiteScore: 6.00
Self-citation rate: 5.30%
Articles published: 48
Review time: >12 weeks
Journal description: This is the official journal of the International Foundation for Autonomous Agents and Multi-Agent Systems. It provides a leading forum for disseminating significant original research results in the foundations, theory, development, analysis, and applications of autonomous agents and multi-agent systems. Coverage in Autonomous Agents and Multi-Agent Systems includes, but is not limited to:
- Agent decision-making architectures and their evaluation, including: cognitive models; knowledge representation; logics for agency; ontological reasoning; planning (single and multi-agent); reasoning (single and multi-agent)
- Cooperation and teamwork, including: distributed problem solving; human-robot/agent interaction; multi-user/multi-virtual-agent interaction; coalition formation; coordination
- Agent communication languages, including: their semantics, pragmatics, and implementation; agent communication protocols and conversations; agent commitments; speech act theory
- Ontologies for agent systems, agents and the semantic web, agents and semantic web services, Grid-based systems, and service-oriented computing
- Agent societies and societal issues, including: artificial social systems; environments, organizations and institutions; ethical and legal issues; privacy, safety and security; trust, reliability and reputation
- Agent-based system development, including: agent development techniques, tools and environments; agent programming languages; agent specification or validation languages
- Agent-based simulation, including: emergent behavior; participatory simulation; simulation techniques, tools and environments; social simulation
- Agreement technologies, including: argumentation; collective decision making; judgment aggregation and belief merging; negotiation; norms
- Economic paradigms, including: auction and mechanism design; bargaining and negotiation; economically-motivated agents; game theory (cooperative and non-cooperative); social choice and voting
- Learning agents, including: computational architectures for learning agents; evolution, adaptation; multi-agent learning
- Robotic agents, including: integrated perception, cognition, and action; cognitive robotics; robot planning (including action and motion planning); multi-robot systems
- Virtual agents, including: agents in games and virtual environments; companion and coaching agents; modeling personality, emotions; multimodal interaction; verbal and non-verbal expressiveness
- Significant, novel applications of agent technology
- Comprehensive reviews and authoritative tutorials of research and practice in agent systems
- Comprehensive and authoritative reviews of books dealing with agents and multi-agent systems