Synthesizing Evolving Symbolic Representations for Autonomous Systems

arXiv - CS - Symbolic Computation | Pub Date: 2024-09-18 | DOI: arxiv-2409.11756

Gabriele Sartor, Angelo Oddi, Riccardo Rasconi, Vieri Giuliano Santucci, Rosa Meo

Abstract

Recently, AI systems have made remarkable progress in various tasks. Deep Reinforcement Learning (DRL) is an effective tool for agents to learn policies in low-level state spaces to solve highly complex tasks. Researchers have introduced Intrinsic Motivation (IM) into the RL mechanism; it simulates the agent's curiosity, encouraging agents to explore interesting areas of the environment. This feature has proved vital in enabling agents to learn policies without being given specific goals. However, even though DRL intelligence emerges through a sub-symbolic model, some form of abstraction is still needed to understand the knowledge collected by the agent. To this end, the classical planning formalism has been used in recent research to explicitly represent the knowledge an autonomous agent acquires and to reach extrinsic goals effectively. Although classical planning usually offers limited expressive capabilities, PPDDL has proved useful for reviewing the knowledge gathered by an autonomous system and making causal correlations explicit, and it can be exploited to find a plan to reach any state the agent encounters during its experience. This work presents a new architecture implementing an open-ended learning system that can synthesize its experience from scratch into a PPDDL representation and update it over time. Without a predefined set of goals and tasks, the system integrates intrinsic motivations to explore the environment in a self-directed way, exploiting the high-level knowledge acquired during its experience. The system explores the environment and iteratively (a) discovers options, (b) explores the environment using options, (c) abstracts the knowledge collected, and (d) plans. This paper proposes an alternative approach to implementing open-ended learning architectures that exploit low-level and high-level representations to extend their knowledge in a virtuous loop.
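The four-step loop named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' architecture. Everything in it is an assumption made for illustration: the 1-D corridor environment, the two options `left` and `right`, the 0.8 success probability, and all function names. Uniform random exploration stands in for intrinsic motivation, a table of estimated outcome probabilities stands in for a PPDDL domain, and breadth-first search stands in for a probabilistic planner.

```python
import random
from collections import Counter, defaultdict, deque

N_CELLS = 5  # hypothetical toy corridor with cells 0..4


def step(state, option, rng):
    """Toy stochastic dynamics (assumed): an option succeeds with probability 0.8."""
    delta = {"left": -1, "right": 1}[option]
    if rng.random() < 0.8:
        return min(max(state + delta, 0), N_CELLS - 1)
    return state


def discover_options(known):
    """(a) Stand-in for option discovery: reveal one new option per cycle."""
    for candidate in ("right", "left"):
        if candidate not in known:
            return known + [candidate]
    return known


def explore(options, start, rng, n_steps=200):
    """(b) Execute randomly chosen options and log (state, option, next_state)."""
    transitions, state = [], start
    for _ in range(n_steps):
        option = rng.choice(options)
        nxt = step(state, option, rng)
        transitions.append((state, option, nxt))
        state = nxt
    return transitions, state


def abstract(transitions):
    """(c) Estimate P(next_state | state, option) from observed counts."""
    counts = defaultdict(Counter)
    for s, o, s2 in transitions:
        counts[(s, o)][s2] += 1
    return {
        key: {s2: n / sum(c.values()) for s2, n in c.items()}
        for key, c in counts.items()
    }


def plan(model, start, goal):
    """(d) Breadth-first search over outcomes the learned model has observed."""
    frontier, seen = deque([(start, [])]), {start}
    while frontier:
        state, path = frontier.popleft()
        if state == goal:
            return path
        for (s, option), outcomes in model.items():
            if s != state:
                continue
            for s2 in outcomes:
                if s2 not in seen:
                    seen.add(s2)
                    frontier.append((s2, path + [option]))
    return None  # goal never observed as reachable


def run(cycles=3, seed=0):
    """Repeat steps (a)-(c), rebuilding the model from all data gathered so far."""
    rng = random.Random(seed)
    options, data, state, model = [], [], 0, {}
    for _ in range(cycles):
        options = discover_options(options)
        new_data, state = explore(options, state, rng)
        data.extend(new_data)
        model = abstract(data)
    return options, model


if __name__ == "__main__":
    options, model = run()
    print("options:", options)
    print("plan from cell 0 to cell 4:", plan(model, 0, 4))
```

The model is rebuilt from the full transition log at every cycle, which is one simple way to obtain a representation that is updated over time. The planner ignores the estimated probabilities and only checks which outcomes were observed; a planner over an actual PPDDL domain would presumably take them into account.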
