Kaiyu Wang, Danni Wang, Bohao Qu, Menglin Zhang, Xianchang Wang, Ximing Li
Title: A multi-agent deep reinforcement learning method for fully noisy observations
DOI: 10.1016/j.engappai.2025.111553
Journal: Engineering Applications of Artificial Intelligence, Volume 159, Article 111553
Published: 2025-07-02 (Journal Article)
Impact Factor: 8.0 · JCR: Q1 (Automation & Control Systems) · Region 2 (Computer Science)
URL: https://www.sciencedirect.com/science/article/pii/S0952197625015556
Citations: 0
Abstract
Multi-agent reinforcement learning (MARL) algorithms have achieved remarkable breakthroughs in many domains and can learn effective policies in ideal simulation environments. Unlike those ideal environments, however, the real world contains unavoidable noise, so MARL algorithms must learn effective policies under fully noisy conditions. In this paper, we consider a challenging MARL problem: no agent can obtain any noiseless observation from the environment at any point during training, and existing MARL algorithms fail to learn effective policies in such fully noisy observation environments. To solve this problem, we propose Robust Policy Learning under Fully Noisy Observation viA DeNoising REpresentation NeTwork (PLANET), which enables MARL algorithms to learn effective policies in fully noisy observation environments. PLANET learns an effective policy in two steps: (1) extracting noise characteristics and motion laws to recover clean observation information from fully noisy observation histories; and (2) having MARL algorithms extract information from these noise characteristics and motion laws and learn effective policies. The results of a series of exhaustive experiments show that our method mitigates the effects of noise and learns effective policies in fully noisy observation environments. Our Artificial Intelligence contribution lies in introducing a denoising representation network that learns noise characteristics and motion dynamics to recover clean observations from fully noisy observations. The proposed PLANET framework could be applied to real-world multi-agent robotic and sensor-network systems, potentially improving policy robustness under fully noisy observation.
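The abstract does not specify PLANET's network architecture, so the following is only a toy illustration of the idea behind step (1): exploiting an assumed motion law and an assumed noise characteristic to recover clean observation estimates from a fully noisy observation history. The sketch substitutes a classical constant-velocity Kalman filter for the learned denoising representation network; all function names, the motion model, and the noise variances (`q`, `r`) are assumptions for this example, not part of the paper's method.

```python
import numpy as np

def denoise_history(noisy_obs, q=1e-3, r=0.25):
    """Estimate clean positions from a fully noisy 1-D observation history.

    q: assumed process-noise variance (uncertainty in the motion law)
    r: assumed observation-noise variance (the noise characteristic)
    """
    # State: [position, velocity]; only a noisy position is observed.
    F = np.array([[1.0, 1.0], [0.0, 1.0]])   # constant-velocity motion model
    H = np.array([[1.0, 0.0]])               # observation picks out position
    Q = q * np.eye(2)
    R = np.array([[r]])

    x = np.array([noisy_obs[0], 0.0])        # initial state estimate
    P = np.eye(2)                            # initial state covariance
    estimates = []
    for z in noisy_obs:
        # Predict forward with the motion model.
        x = F @ x
        P = F @ P @ F.T + Q
        # Correct the prediction with the noisy observation.
        y = z - H @ x                        # innovation
        S = H @ P @ H.T + R                  # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
        x = x + (K @ y).ravel()
        P = (np.eye(2) - K @ H) @ P
        estimates.append(x[0])
    return np.array(estimates)

# Toy experiment: an agent moves at constant speed, and every single
# observation during "training" is noisy (no clean observation is ever seen).
rng = np.random.default_rng(0)
clean = 0.1 * np.arange(200)                 # true positions (motion law)
noisy = clean + rng.normal(0.0, 0.5, 200)    # fully noisy observation history
denoised = denoise_history(noisy)

raw_err = np.mean((noisy - clean) ** 2)      # error of raw noisy observations
den_err = np.mean((denoised - clean) ** 2)   # error after denoising
```

In this toy setting the filtered estimates track the true trajectory far more closely than the raw observations do, which mirrors the paper's premise: observation histories plus a motion model carry enough structure to suppress per-step noise, so a policy can be learned from the denoised representation instead of the raw noisy inputs.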
About the journal:
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.