Complete coverage planning using Deep Reinforcement Learning for polyiamonds-based reconfigurable robot

IF 7.5 · CAS Tier 2, Computer Science · JCR Q1, Automation & Control Systems
Anh Vu Le , Dinh Tung Vo , Nguyen Tien Dat , Minh Bui Vu , Mohan Rajesh Elara
DOI: 10.1016/j.engappai.2024.109424
Published: 2024-10-22 (Journal Article)
Full text: https://www.sciencedirect.com/science/article/pii/S0952197624015823
Citations: 0

Abstract

Achieving complete coverage in complex areas is a critical objective for tiling tasks such as cleaning, painting, maintenance, and inspection. However, existing robots in the market, with their fixed morphologies, face limitations when it comes to accessing confined spaces. Reconfigurable tiling robots provide a feasible solution to this challenge. By shapeshifting among the available morphologies to adapt to the different conditions of complex environments, these robots can enhance the efficiency of complete coverage. However, the ability to change shape is constrained by energy usage considerations. Hence, it is important to have an optimal strategy to generate a trajectory that covers confined areas with minimal reconfiguration actions while taking into account the finite set of possible shapes. This paper proposes a complete coverage planning (CCP) framework for a reconfigurable tiling robot called hTetrakis, which consists of three polyiamond blocks. The CCP framework leverages Deep Reinforcement Learning (DRL) to derive an optimal action policy within a polyiamond shape-based workspace. By maximizing cumulative rewards to optimize the overall kinetic energy-based cost weight, the proposed DRL model plans the hTetrakis shapes and their trajectories simultaneously. To this end, the DRL model utilizes Convolutional Neural Networks (CNNs) with a Long Short-Term Memory (LSTM) network and adopts the Actor–Critic with Experience Replay (ACER) approach for off-policy decision-making. By producing trajectories with reduced costs and time, the proposed CCP framework surpasses conventional heuristic optimization methods that rely on tiling strategies, such as Particle Swarm Optimization (PSO), Differential Evolution (DE), Genetic Algorithm (GA), and Ant Colony Optimization (ACO).
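The core trade-off the abstract describes can be illustrated with a toy environment: the agent earns reward for covering new cells but pays a larger, kinetic-energy-style penalty whenever it reconfigures its shape, so an optimal policy sweeps with as few shape changes as possible. The sketch below is purely illustrative and is not the paper's model: it uses a square grid rather than the paper's polyiamond (triangular) workspace, and the names `CoverageEnv`, `SHAPES`, and all cost values are hypothetical.

```python
from dataclasses import dataclass, field

SHAPES = ["I", "L", "T"]  # placeholder morphologies for a 3-block robot

@dataclass
class CoverageEnv:
    """Toy coverage environment with a reconfiguration penalty (illustrative only)."""
    width: int = 4
    height: int = 4
    move_cost: float = 1.0       # cost of every movement step
    reconfig_cost: float = 5.0   # shape changes are penalized more heavily
    pos: tuple = (0, 0)
    shape: str = "I"
    covered: set = field(default_factory=set)

    def reset(self):
        self.pos, self.shape = (0, 0), "I"
        self.covered = {self.pos}
        return self.pos, self.shape

    def step(self, move, new_shape):
        """Apply a move (dx, dy) and a target shape; return state, reward, done."""
        x = min(max(self.pos[0] + move[0], 0), self.width - 1)
        y = min(max(self.pos[1] + move[1], 0), self.height - 1)
        self.pos = (x, y)
        reward = -self.move_cost
        if new_shape != self.shape:       # kinetic-energy-style penalty
            reward -= self.reconfig_cost
            self.shape = new_shape
        if self.pos not in self.covered:  # reward newly covered cells
            self.covered.add(self.pos)
            reward += 2.0
        done = len(self.covered) == self.width * self.height
        return (self.pos, self.shape), reward, done
```

Under this reward, a step onto a fresh cell without reconfiguring yields -1 + 2 = 1, while the same step combined with a shape change yields -1 - 5 + 2 = -4, so cumulative-reward maximization (the DRL objective in the abstract) naturally discourages unnecessary reconfigurations.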
Source journal
Engineering Applications of Artificial Intelligence (category: Engineering, Electrical & Electronic)
CiteScore: 9.60
Self-citation rate: 10.00%
Articles per year: 505
Review time: 68 days
Journal description: Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes.