Reinforcement learning in convergently non-stationary environments: Feudal hierarchies and learned representations
Diogo S. Carvalho, Pedro A. Santos, Francisco S. Melo
Artificial Intelligence, Volume 347, Article 104382 (published 2025-06-13). DOI: 10.1016/j.artint.2025.104382
URL: https://www.sciencedirect.com/science/article/pii/S0004370225001018
Abstract
We study the convergence of Q-learning-based methods in convergently non-stationary environments, particularly in the context of hierarchical reinforcement learning and of dynamic features encountered in deep reinforcement learning. We demonstrate that Q-learning achieves convergence in tabular representations when applied to convergently non-stationary dynamics, such as the ones arising in a feudal hierarchical setting. Additionally, we establish convergence for Q-learning-based deep reinforcement learning methods with convergently non-stationary features, such as the ones arising in representation-based settings. Our findings offer theoretical support for the application of Q-learning in these complex scenarios and present methodologies for extending established theoretical results from standard cases to their convergently non-stationary counterparts.
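To make the "convergently non-stationary" setting concrete, below is a minimal illustrative sketch (not the paper's construction): tabular Q-learning run on a small two-state MDP whose transition kernel drifts over time but converges to a fixed limiting kernel. The chain, the drift schedule, and all constants are hypothetical choices for illustration; the abstract's claim is that Q-learning still converges under dynamics of this converging kind.

```python
import numpy as np

# Hypothetical example: Q-learning on a 2-state, 2-action MDP whose
# transition probabilities are non-stationary but converge as t grows,
# mimicking the "convergently non-stationary dynamics" in the abstract.
rng = np.random.default_rng(0)
n_states, n_actions, gamma = 2, 2, 0.9
Q = np.zeros((n_states, n_actions))

def p_stay(t: int, a: int) -> float:
    # For action 0, P(stay) drifts from ~0.9 toward its limit 0.5,
    # so the time-varying kernel P_t converges to a stationary one.
    return 0.5 + 0.4 / (1 + 0.01 * t) if a == 0 else 0.5

s = 0
for t in range(1, 200_000):
    a = int(rng.integers(n_actions))          # uniform exploration
    s_next = s if rng.random() < p_stay(t, a) else 1 - s
    r = 1.0 if s_next == 1 else 0.0           # reward for reaching state 1
    alpha = 1.0 / (1 + 0.001 * t)             # Robbins-Monro step sizes
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
    s = s_next

print(Q)  # approximates Q* of the limiting (stationary) MDP
```

With the step sizes satisfying the usual Robbins-Monro conditions and the dynamics converging, the learned table approaches the optimal Q-values of the limiting MDP, which is the flavor of result the paper establishes for feudal hierarchies and for learned representations.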
Journal description
The Journal of Artificial Intelligence (AIJ) welcomes papers covering a broad spectrum of AI topics, including cognition, automated reasoning, computer vision, machine learning, and more. Papers should demonstrate advancements in AI and propose innovative approaches to AI problems. Additionally, the journal accepts papers describing AI applications, focusing on how new methods enhance performance rather than reiterating conventional approaches. In addition to regular papers, AIJ also accepts Research Notes, Research Field Reviews, Position Papers, Book Reviews, and summary papers on AI challenges and competitions.