Hybrid Residual Multiexpert Reinforcement Learning for Spatial Scheduling of High-Density Parking Lots

IF 9.4; Zone 1, Computer Science; Q1, AUTOMATION & CONTROL SYSTEMS

IEEE Transactions on Cybernetics Pub Date : 2023-10-23 DOI:10.1109/TCYB.2023.3312647

Jing Hou;Guang Chen;Zhijun Li;Wei He;Shangding Gu;Alois Knoll;Changjun Jiang

Volume 54, Issue 5, pages 2771-2783. Journal Article. Full text: https://ieeexplore.ieee.org/document/10290933/

Citations: 0

Abstract

Industries such as manufacturing are accelerating their adoption of the metaverse to achieve higher productivity, especially in complex industrial scheduling. Given the growing parking challenges in large cities, high-density spatial scheduling of vehicles is one potential solution. Stack-based parking lots use parking robots to park vehicles densely in vertical stacks, much like container stacking, which greatly reduces the aisle area of the lot but requires complex scheduling algorithms to park and retrieve vehicles. Existing high-density parking (HDP) scheduling algorithms are mainly heuristic methods, which contain only simple logic and struggle to use the available information effectively. We propose a hybrid residual multiexpert (HIRE) reinforcement learning (RL) approach, a method for interactive learning in the digital industrial metaverse, which efficiently solves the HDP batch space scheduling problem. In the proposed framework, each heuristic scheduling method is treated as an expert. A neural network trained by RL assigns the expert strategy according to the current parking lot state. Furthermore, to avoid being limited by the performance of the heuristic experts, the proposed hierarchical network framework also includes a residual output channel. Experiments show that the proposed algorithm outperforms various advanced heuristic methods and an end-to-end RL method in the number of vehicle maneuvers, and is robust to the parking lot size and to the estimation accuracy of vehicle exit times. We believe that the proposed HIRE RL method can be applied effectively and conveniently in practical scenarios, which can be regarded as a key step for RL toward practical application in the industrial metaverse.
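
The abstract describes three ingredients: heuristic schedulers treated as experts, a learned network that assigns an expert according to the lot state, and a residual output channel that lets the policy deviate from the experts. The sketch below is only one possible reading of that idea and is not the authors' implementation. The stack model, both heuristics, the function names, and the way the residual is combined with the expert's proposal are all assumptions made for illustration; in the paper the expert scores and the residual would come from an RL-trained network, whereas here they are passed in as plain lists.

```python
from typing import Callable, List, Sequence

# Assumed lot model: each stack lists estimated exit times, and the last
# element is the only vehicle that can leave without moving another one.
Stack = List[float]
Expert = Callable[[Sequence[Stack], float, int], int]


def feasible(stacks: Sequence[Stack], capacity: int) -> List[int]:
    """Indices of stacks that still have a free slot."""
    return [i for i, s in enumerate(stacks) if len(s) < capacity]


def expert_emptiest(stacks: Sequence[Stack], exit_time: float, capacity: int) -> int:
    """Hypothetical heuristic expert: choose the non-full stack with the fewest vehicles."""
    return min(feasible(stacks, capacity), key=lambda i: len(stacks[i]))


def expert_exit_order(stacks: Sequence[Stack], exit_time: float, capacity: int) -> int:
    """Hypothetical heuristic expert: avoid blocking a vehicle that leaves earlier."""
    options = feasible(stacks, capacity)
    # A stack is safe if it is empty or its accessible vehicle leaves no earlier
    # than the new vehicle, so the new vehicle never has to be moved aside.
    safe = [i for i in options if not stacks[i] or stacks[i][-1] >= exit_time]
    if not safe:
        return expert_emptiest(stacks, exit_time, capacity)
    # Prefer the tightest exit-time gap; an empty stack counts as an infinite gap.
    return min(
        safe,
        key=lambda i: stacks[i][-1] - exit_time if stacks[i] else float("inf"),
    )


def hire_select(
    stacks: Sequence[Stack],
    exit_time: float,
    capacity: int,
    experts: Sequence[Expert],
    expert_scores: Sequence[float],
    residual: Sequence[float],
) -> int:
    """Pick a stack from an expert proposal plus a residual correction.

    expert_scores and residual stand in for network outputs (assumption).
    """
    options = feasible(stacks, capacity)
    # Gating step: follow the expert with the highest score.
    best = max(range(len(experts)), key=lambda k: expert_scores[k])
    proposal = experts[best](stacks, exit_time, capacity)
    # Residual step: the proposal starts with preference 1.0, every other
    # feasible stack with 0.0, and the residual shifts those preferences.
    preference = {i: (1.0 if i == proposal else 0.0) + residual[i] for i in options}
    return max(options, key=lambda i: preference[i])


# Usage: three stacks of capacity 3, new vehicle expected to leave at t = 4.0.
lot = [[5.0, 3.0], [9.0], []]
experts = [expert_emptiest, expert_exit_order]
followed = hire_select(lot, 4.0, 3, experts, [0.2, 0.8], [0.0, 0.0, 0.0])
overridden = hire_select(lot, 4.0, 3, experts, [0.2, 0.8], [0.0, -0.7, 0.6])
print(followed, overridden)
```

With a zero residual the selection simply follows the highest-scoring expert. A sufficiently large residual overrides that proposal, which is the sense in which a residual channel could keep a policy from being capped by the quality of its heuristics.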

Source journal

IEEE Transactions on Cybernetics (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; COMPUTER SCIENCE, CYBERNETICS)

CiteScore: 25.40

Self-citation rate: 11.00%

Articles published: 1869

Journal description: The scope of the IEEE Transactions on Cybernetics includes computational approaches to the field of cybernetics. Specifically, the Transactions welcomes papers on communication and control across machines or among machines, humans, and organizations. The scope includes such areas as computational intelligence, computer vision, neural networks, genetic algorithms, machine learning, fuzzy systems, cognitive systems, decision making, and robotics, to the extent that they contribute to the theme of cybernetics or demonstrate an application of cybernetics principles.