Synthesizing Evolving Symbolic Representations for Autonomous Systems
Gabriele Sartor, Angelo Oddi, Riccardo Rasconi, Vieri Giuliano Santucci, Rosa Meo
arXiv - CS - Symbolic Computation, 2024-09-18
Identifier: arxiv-2409.11756 (https://doi.org/arxiv-2409.11756)
Citations: 0
Abstract
Recently, AI systems have made remarkable progress in various tasks. Deep
Reinforcement Learning (DRL) is an effective tool for agents to learn policies
in low-level state spaces to solve highly complex tasks. Researchers have
introduced Intrinsic Motivation (IM) to the RL mechanism, which simulates the
agent's curiosity, encouraging agents to explore interesting areas of the
environment. This new feature has proved vital in enabling agents to learn
policies without being given specific goals. However, even though DRL
intelligence emerges through a sub-symbolic model, some form of abstraction is
still needed to understand the knowledge collected by the agent. To this
end, the classical planning formalism has been used in recent research to
explicitly represent the knowledge an autonomous agent acquires and effectively
reach extrinsic goals. Although classical planning usually offers limited
expressive capabilities, PPDDL (Probabilistic Planning Domain Definition
Language) has proved useful for reviewing the knowledge gathered by an
autonomous system and making causal correlations explicit, and it can be
exploited to find a plan to reach any state the agent encounters during its
experience. This work presents a new architecture implementing
an open-ended learning system able to synthesize its experience from scratch
into a PPDDL representation and update it over time. Without a predefined set
of goals and tasks, the system integrates intrinsic motivations to explore the
environment in a self-directed way, exploiting the high-level knowledge
acquired during its experience. The system explores the environment and
iteratively (a) discovers options, (b) explores the environment using options,
(c) abstracts the knowledge collected, and (d) plans. This paper proposes an
alternative approach to implementing open-ended learning architectures that
exploit low-level and high-level representations to extend their knowledge in
a virtuous loop.
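
The four-step loop described in the abstract can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the five-cell corridor world, the hand-written "left" and "right" options standing in for learned ones, the transition-count table standing in for a PPDDL model, and the least-visited-state rule standing in for intrinsic motivation. All function names are hypothetical.

```python
import random
from collections import Counter, defaultdict

N_CELLS = 5  # hypothetical toy world: a corridor of five cells


def discover_options():
    """(a) Discover options. Two hand-written skills stand in for learned ones."""
    return {
        "left": lambda s: max(s - 1, 0),
        "right": lambda s: min(s + 1, N_CELLS - 1),
    }


def explore(options, state, steps, rng):
    """(b) Explore with options, recording (state, option, next_state) triples."""
    transitions = []
    for _ in range(steps):
        name = rng.choice(sorted(options))
        nxt = options[name](state)
        transitions.append((state, name, nxt))
        state = nxt
    return transitions, state


def abstract(transitions, model):
    """(c) Abstract: fold transitions into (precondition, option) -> effect counts."""
    for state, name, nxt in transitions:
        model[(state, name)][nxt] += 1
    return model


def plan(model, start, goal):
    """(d) Plan: breadth-first search using each operator's most frequent effect."""
    frontier, seen = [(start, [])], {start}
    while frontier:
        state, path = frontier.pop(0)
        if state == goal:
            return path
        for (pre, name), effects in model.items():
            if pre != state:
                continue
            nxt = max(effects, key=effects.get)
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [name]))
    return None  # goal not reachable with the knowledge gathered so far


def open_ended_loop(iterations=3, steps=20, seed=0):
    """Repeat steps (a)-(d), carrying the symbolic model across iterations."""
    rng = random.Random(seed)
    model = defaultdict(Counter)
    visits = Counter()
    state = 0
    for _ in range(iterations):
        options = discover_options()
        transitions, state = explore(options, state, steps, rng)
        visits.update(nxt for _, _, nxt in transitions)
        abstract(transitions, model)
        # Self-generated goal: the least-visited state seen so far.
        goal = min(visits, key=visits.get)
        for name in plan(model, state, goal) or []:
            state = options[name](state)
    return model, state
```

Calling `open_ended_loop()` returns the accumulated model and the agent's final state. The point of the sketch is only the control flow: low-level experience feeds the symbolic model, and the model in turn directs where the agent goes next.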