Finite-Time Analysis of Asynchronous Multi-Agent TD Learning

Nicolò Dal Fabbro, Arman Adibi, Aritra Mitra, George J. Pappas

arXiv - CS - Multiagent Systems, published 2024-07-29 (arXiv-2407.20441)
Citations: 0
Abstract
Recent research endeavours have theoretically shown the beneficial effect of
cooperation in multi-agent reinforcement learning (MARL). In a setting
involving $N$ agents, this beneficial effect usually comes in the form of an
$N$-fold linear convergence speedup, i.e., a reduction, proportional to $N$,
in the number of iterations required to reach a certain convergence precision.
In this paper, we show for the first time that this speedup property also holds
for a MARL framework subject to asynchronous delays in the local agents'
updates. In particular, we consider a policy evaluation problem in which
multiple agents cooperate to evaluate a common policy by communicating with a
central aggregator. In this setting, we study the finite-time convergence of
\texttt{AsyncMATD}, an asynchronous multi-agent temporal difference (TD)
learning algorithm in which agents' local TD update directions are subject to
asynchronous bounded delays. Our main contribution is providing a finite-time
analysis of \texttt{AsyncMATD}, for which we establish a linear convergence
speedup while highlighting the effect of time-varying asynchronous delays on
the resulting convergence rate.
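To make the setting concrete, the following is a minimal simulation sketch of the kind of scheme the abstract describes: $N$ agents compute TD(0) update directions for a common value function, a central aggregator averages them, and each agent's direction may be evaluated at a stale iterate subject to a bounded delay. The Markov reward process, feature map, step size, and delay model below are all illustrative assumptions, not the paper's actual \texttt{AsyncMATD} specification or analysis conditions.

```python
import numpy as np

rng = np.random.default_rng(0)

# A small synthetic Markov reward process (assumed for illustration).
n_states, gamma = 4, 0.9
P = rng.dirichlet(np.ones(n_states), size=n_states)   # row-stochastic transitions
r = rng.uniform(0, 1, size=n_states)                  # expected rewards
phi = np.eye(n_states)                                # tabular features

def td_direction(theta, s, s_next, reward):
    """TD(0) update direction: delta * phi(s), with delta the TD error."""
    delta = reward + gamma * phi[s_next] @ theta - phi[s] @ theta
    return delta * phi[s]

def async_matd(N=8, T=2000, alpha=0.2, max_delay=3):
    """Averaged multi-agent TD(0) where each agent's direction is computed
    at an iterate stale by at most `max_delay` rounds (bounded delays)."""
    theta = np.zeros(n_states)
    history = [theta.copy()]                  # past iterates, for stale reads
    states = rng.integers(0, n_states, N)     # each agent's current state
    for t in range(T):
        g = np.zeros(n_states)
        for i in range(N):
            # Agent i reads a delayed copy of the parameter vector.
            d = rng.integers(0, min(max_delay, t) + 1)
            theta_stale = history[-1 - d]
            s = states[i]
            s_next = rng.choice(n_states, p=P[s])
            reward = r[s] + 0.1 * rng.standard_normal()  # noisy reward sample
            g += td_direction(theta_stale, s, s_next, reward)
            states[i] = s_next
        theta = theta + alpha * g / N         # aggregator averages N directions
        history.append(theta.copy())
    return theta

# Ground-truth value function of the synthetic MRP: V = (I - gamma*P)^{-1} r
v_true = np.linalg.solve(np.eye(n_states) - gamma * P, r)
v_est = async_matd()
```

With tabular features the TD fixed point is the true value function, so despite the bounded staleness the averaged iterates should approach `v_true`; the paper's contribution is precisely a finite-time bound quantifying this, including the $N$-fold speedup and the delay-dependent terms.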