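Optimizing Plastic Waste Collection in Water Bodies Using Heterogeneous Autonomous Surface Vehicles With Deep Reinforcement Learning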

IF 4.6, Zone 2 (Computer Science), JCR Q2 (Robotics)

IEEE Robotics and Automation Letters, vol. 10, no. 5, pp. 4930-4937. Pub Date: 2025-03-28. DOI: 10.1109/LRA.2025.3555940

Alejandro Mendoza Barrionuevo; Samuel Yanes Luis; Daniel Gutiérrez Reina; Sergio L. Toral Marín


Citations: 0

Abstract


Optimizing Plastic Waste Collection in Water Bodies Using Heterogeneous Autonomous Surface Vehicles With Deep Reinforcement Learning

This letter presents a model-free deep reinforcement learning framework for informative path planning with heterogeneous fleets of autonomous surface vehicles that locate and collect plastic waste. The system employs two teams of vehicles: scouts and cleaners. Coordination between the teams is achieved through deep reinforcement learning, which allows agents to learn strategies that maximize cleaning efficiency. The scout team's primary objective is to provide an up-to-date contamination model, while the cleaner team collects as much waste as possible by following this model. This strategy yields heterogeneous teams that optimize fleet efficiency through inter-team cooperation, supported by a tailored reward function. Several trainings of the proposed algorithm are compared with other state-of-the-art algorithms in three distinct scenarios: one with moderate convexity, another with narrow corridors and challenging access, and a third that is larger, more complex, and has a harder-to-access shape. The results show that deep reinforcement learning based algorithms outperform the baselines and exhibit superior adaptability. In addition, training with examples of actions from other algorithms further improves performance, especially in scenarios where the search space is larger.
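The abstract does not spell out the reward function, so the following is only a toy sketch of how a scout/cleaner reward split could look on a grid map. Everything in it is an assumption made for illustration: the function names (`scout_observe`, `cleaner_collect`, `team_rewards`), the square sensing window, and the `share` coefficient are hypothetical and are not taken from the paper.

```python
import numpy as np


def scout_observe(truth, model, pos, radius=1):
    """Hypothetical scout step: copy true trash counts around `pos` into the
    shared contamination model and return how much the model was corrected."""
    r, c = pos
    r0, r1 = max(r - radius, 0), min(r + radius + 1, truth.shape[0])
    c0, c1 = max(c - radius, 0), min(c + radius + 1, truth.shape[1])
    # Model error inside the sensing window, measured before the update.
    correction = float(np.abs(truth[r0:r1, c0:c1] - model[r0:r1, c0:c1]).sum())
    model[r0:r1, c0:c1] = truth[r0:r1, c0:c1]
    return correction


def cleaner_collect(truth, model, pos):
    """Hypothetical cleaner step: remove all trash in the cleaner's own cell."""
    r, c = pos
    collected = float(truth[r, c])
    truth[r, c] = 0.0
    model[r, c] = 0.0
    return collected


def team_rewards(correction, collected, share=0.5):
    """Assumed reward split: scouts are paid for improving the model plus a
    share of what cleaners collect; cleaners are paid for collected trash."""
    scout_reward = correction + share * collected
    cleaner_reward = collected
    return scout_reward, cleaner_reward


if __name__ == "__main__":
    truth = np.zeros((5, 5))
    truth[1, 1], truth[4, 4] = 3.0, 2.0
    model = np.zeros_like(truth)  # the fleet starts with no knowledge

    correction = scout_observe(truth, model, pos=(1, 2))
    collected = cleaner_collect(truth, model, pos=(1, 1))
    print(team_rewards(correction, collected))
```

Giving scouts a share of the collected waste is one simple way to tie the two teams' incentives together; the paper's tailored reward function may be structured quite differently.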


Source journal

IEEE Robotics and Automation Letters (Computer Science: Computer Science Applications)

CiteScore

9.60

Self-citation rate

15.40%

Articles published

1428

About the journal: The scope of this journal is to publish peer-reviewed articles that provide a timely and concise account of innovative research ideas and application results, reporting significant theoretical findings and application case studies in areas of robotics and automation.