{"title":"Deep reinforcement learning navigation via decision transformer in autonomous driving","authors":"Lun Ge, Xiaoguang Zhou, Yongqiang Li, Yongcong Wang","doi":"10.3389/fnbot.2024.1338189","DOIUrl":"https://doi.org/10.3389/fnbot.2024.1338189","url":null,"abstract":"In real-world scenarios, making navigation decisions for autonomous driving involves a sequence of steps. These decisions are made from partial observations of the environment, while the underlying model of the environment remains unknown. A prevalent method for resolving such problems is reinforcement learning, in which the agent acquires knowledge through a succession of rewards in addition to fragmentary and noisy observations. This study introduces an algorithm named deep reinforcement learning navigation via decision transformer (DRLNDT) to address the challenge of enhancing the decision-making capabilities of autonomous vehicles operating in partially observable urban environments. The DRLNDT framework is built around the Soft Actor-Critic (SAC) algorithm. DRLNDT uses Transformer neural networks to model the temporal dependencies in observations and actions, which helps mitigate decision errors arising from sensor noise or occlusion within a given state. A variational autoencoder (VAE) extracts latent vectors from high-quality camera images, reducing the dimensionality of the state space and improving training efficiency. The multimodal state space consists of vector states, including velocity and position, which the vehicle's intrinsic sensors can readily obtain, together with the latent image vectors, which help the agent assess the current trajectory. 
Experiments demonstrate that DRLNDT achieves a superior policy without prior knowledge of the environment, detailed maps, or routing assistance, surpassing the baseline technique and other policy methods that lack historical data.","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"117 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140170744","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
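The multimodal state described in this abstract (a VAE image latent concatenated with proprioceptive vector states, accumulated into a rolling history for the Transformer to attend over) can be sketched as follows. This is an illustrative sketch, not the authors' code; the class name, dimensions, and zero-padding convention are assumptions:

```python
from collections import deque


class MultimodalHistory:
    """Rolling context of (image latent, vehicle state, action) tokens,
    of the kind a sequence model such as DRLNDT's Transformer consumes."""

    def __init__(self, context_len=8, latent_dim=32):
        self.context_len = context_len
        self.latent_dim = latent_dim
        self.buffer = deque(maxlen=context_len)  # oldest tokens drop off

    def push(self, latent, vec_state, action):
        # One token = VAE latent + intrinsic-sensor state + last action.
        assert len(latent) == self.latent_dim
        self.buffer.append(list(latent) + list(vec_state) + list(action))

    def as_sequence(self):
        # Left-pad with zero tokens so the sequence length is fixed,
        # as attention models typically expect.
        if not self.buffer:
            return []
        width = len(self.buffer[0])
        pad = [[0.0] * width] * (self.context_len - len(self.buffer))
        return pad + list(self.buffer)
```

A policy network would then embed each token and apply self-attention across the `context_len` positions, letting the agent disambiguate states that a single noisy or occluded observation cannot.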
{"title":"Human skill knowledge guided global trajectory policy reinforcement learning method","authors":"Yajing Zang, Pengfei Wang, Fusheng Zha, Wei Guo, Chuanfeng Li, Lining Sun","doi":"10.3389/fnbot.2024.1368243","DOIUrl":"https://doi.org/10.3389/fnbot.2024.1368243","url":null,"abstract":"Traditional trajectory learning methods based on Imitation Learning (IL) only learn existing trajectory knowledge from human demonstration. As a result, they cannot adapt the trajectory knowledge to the task environment by interacting with the environment and fine-tuning the policy. To address this problem, a global trajectory learning method that combines IL with Reinforcement Learning (RL) to adapt the knowledge policy to the environment is proposed. In this paper, IL is first used to acquire basic trajectory skills, and the agent then explores through RL to find policies better suited to the current environment. The basic trajectory skills include the knowledge policy and the time-stage information across the whole task space, which help learn the time series of the trajectory and guide the subsequent RL process. Notably, neural networks are not used to model the action policy and the Q value during the RL process. Instead, these are sampled and updated over the whole task space and then transferred to the networks after the RL process through Behavior Cloning (BC) to obtain a continuous and smooth global trajectory policy. The feasibility and effectiveness of the method were validated in a custom Gym environment of a flower drawing task. 
The learned policy was then executed in a real-world robot drawing experiment.","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"495 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140151072","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
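The "initialize from demonstration, refine by interaction, then clone" idea in this abstract can be illustrated with a toy sketch: a tabular policy seeded by the demonstration is refined per state with a simple reward-feedback update. This is a minimal stand-in under stated assumptions, not the paper's method; `refine_demo_policy` and the bandit-style update rule are illustrative:

```python
import random


def refine_demo_policy(demo_policy, reward_fn, actions, episodes=200, eps=0.2, lr=0.5):
    """Start from an IL policy (state -> action table from demonstration)
    and refine it with an epsilon-greedy, bandit-style value update per state.
    The demonstrated action gets an optimistic initial value, so exploration
    departs from it only when the environment rewards a different action."""
    q = {s: {a: (1.0 if a == demo_policy[s] else 0.0) for a in actions}
         for s in demo_policy}
    for _ in range(episodes):
        for s in q:
            # Explore with probability eps, otherwise exploit current estimate.
            a = random.choice(actions) if random.random() < eps else max(q[s], key=q[s].get)
            q[s][a] += lr * (reward_fn(s, a) - q[s][a])  # move toward observed reward
    # The refined table is what BC would distill into a smooth network policy.
    return {s: max(q[s], key=q[s].get) for s in q}
```

The final table-to-network distillation step (BC) is omitted here; the sketch only shows how interaction can overturn a demonstrated action when the environment disagrees with it.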
{"title":"A reinforcement learning enhanced pseudo-inverse approach to self-collision avoidance of redundant robots","authors":"Tinghe Hong, Weibing Li, Kai Huang","doi":"10.3389/fnbot.2024.1375309","DOIUrl":"https://doi.org/10.3389/fnbot.2024.1375309","url":null,"abstract":"<sec><title>Introduction</title><p>Redundant robots offer greater flexibility compared to non-redundant ones but are susceptible to increased collision risks when the end-effector approaches the robot's own links. Redundant degrees of freedom (DoFs) present an opportunity for collision avoidance; however, selecting an appropriate inverse kinematics (IK) solution remains challenging due to the infinite possible solutions.</p></sec><sec><title>Methods</title><p>This study proposes a reinforcement learning (RL) enhanced pseudo-inverse approach to address self-collision avoidance in redundant robots. The RL agent is integrated into the redundancy resolution process of a pseudo-inverse method to determine a suitable IK solution for avoiding self-collisions during task execution. 
Additionally, an improved replay buffer is implemented to enhance the performance of the RL algorithm.</p></sec><sec><title>Results</title><p>Simulations and experiments validate the effectiveness of the proposed method in reducing the risk of self-collision in redundant robots.</p></sec><sec><title>Conclusion</title><p>The RL enhanced pseudo-inverse approach presented in this study demonstrates promising results in mitigating self-collision risks in redundant robots, highlighting its potential for enhancing safety and performance in robotic systems.</p></sec>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"17 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140313516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
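The redundancy resolution this abstract builds on is the standard pseudo-inverse law q̇ = J⁺ẋ + (I − J⁺J)z, where the null-space vector z can be chosen freely (here, by the RL agent, to steer away from self-collision) without disturbing the end-effector task. A minimal sketch for a one-DoF task with a row Jacobian, where J⁺ = Jᵀ(JJᵀ)⁻¹ has a closed form; this is illustrative, not the paper's implementation:

```python
def redundancy_resolution(J, x_dot, z):
    """q_dot = J+ * x_dot + (I - J+ J) z for a 1xN row Jacobian J.
    z is the secondary objective (e.g., the RL agent's collision-avoidance
    preference); the null-space projector guarantees J @ q_dot == x_dot."""
    jj = sum(j * j for j in J)                 # J J^T (a scalar for a row Jacobian)
    J_pinv = [j / jj for j in J]               # J+ = J^T (J J^T)^-1
    n = len(J)
    Jz = sum(J[k] * z[k] for k in range(n))    # J z, needed for the projection
    q_dot = []
    for i in range(n):
        null_i = z[i] - J_pinv[i] * Jz         # (I - J+ J) z, component i
        q_dot.append(J_pinv[i] * x_dot + null_i)
    return q_dot
```

Whatever z the agent picks, the task-space velocity is preserved; the RL contribution in the paper is choosing z (and improving the replay buffer) so that the selected IK solution stays collision-free.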
{"title":"RFG-TVIU: robust factor graph for tightly coupled vision/IMU/UWB integration","authors":"Gongjun Fan, Qing Wang, Gaochao Yang, Pengfei Liu","doi":"10.3389/fnbot.2024.1343644","DOIUrl":"https://doi.org/10.3389/fnbot.2024.1343644","url":null,"abstract":"High-precision navigation and positioning technology, as a fundamental capability, is becoming indispensable in a variety of fields. However, a single sensor cannot meet the navigation requirements of different scenarios. This paper proposes a “plug and play” Vision/IMU/UWB multi-sensor tightly-coupled system based on a factor graph. Unlike traditional UWB-based tightly-coupled models, the Vision/IMU/UWB tightly-coupled model in this study treats the UWB base station coordinates as parameters for real-time estimation, without pre-calibrating the UWB base stations. To address the dynamic changes in sensor availability in multi-sensor integrated navigation and the weaknesses of traditional factor graphs in weighting observation information, this study proposes an adaptive robust factor graph model. Based on redundant measurement information, we propose a novel adaptive estimation model for UWB ranging covariance, which does not rely on prior information about the system and can adaptively estimate real-time covariance changes in UWB ranging. The proposed algorithm was extensively tested in real-world scenarios, and the results show that the proposed system outperforms state-of-the-art combination methods in all cases. 
Compared with visual-inertial odometry based on the factor graph (FG-VIO), the RMSE is improved by 62.83% and 64.26% in scene 1, and by 82.15%, 70.32%, and 75.29% in scene 2 (a non-line-of-sight environment).","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"8 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140810336","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
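One plausible reading of "adaptively estimate real-time covariance changes of UWB ranging" from redundant measurements is an online variance estimate over ranging residuals, whose inverse then weights the UWB factors in the graph. The sketch below uses Welford's online algorithm and is an assumption for illustration, not the paper's exact model:

```python
class AdaptiveRangeCovariance:
    """Online (Welford) estimate of UWB ranging noise variance from
    redundant-measurement residuals; a factor graph would weight each
    UWB ranging factor by the inverse of this variance."""

    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def update(self, residual):
        # Numerically stable single-pass mean/variance accumulation.
        self.n += 1
        d = residual - self.mean
        self.mean += d / self.n
        self.m2 += d * (residual - self.mean)

    def variance(self):
        # Unbiased sample variance; infinite (zero weight) until two samples.
        return self.m2 / (self.n - 1) if self.n > 1 else float("inf")
```

Because the estimate tracks the residual stream itself, the weight of UWB observations drops automatically when ranging degrades (e.g., non-line-of-sight), which is the behavior the adaptive robust factor graph is after.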
{"title":"Brain-inspired semantic data augmentation for multi-style images","authors":"Wei Wang, Zhaowei Shang, Chengxing Li","doi":"10.3389/fnbot.2024.1382406","DOIUrl":"https://doi.org/10.3389/fnbot.2024.1382406","url":null,"abstract":"<p>Data augmentation is an effective technique for automatically expanding training data in deep learning. Brain-inspired methods draw inspiration from the functionality and structure of the human brain and apply these mechanisms and principles to artificial intelligence and computer science. When there is a large style difference between training data and testing data, common data augmentation methods cannot effectively enhance the generalization performance of a deep model. To solve this problem, we improve modeling Domain Shifts with Uncertainty (DSU) and propose a new brain-inspired computer vision image data augmentation method that consists of two key components, namely, <italic>using Robust statistics and controlling the Coefficient of variance for DSU</italic> (RCDSU) and <italic>Feature Data Augmentation</italic> (FeatureDA). RCDSU calculates feature statistics (mean and standard deviation) with robust statistics to weaken the influence of outliers, making the statistics close to the real values and improving the robustness of deep learning models. By controlling the coefficient of variance, RCDSU shifts the feature statistics while preserving semantics and increases the shift range. FeatureDA controls the coefficient of variance similarly to generate augmented features with unchanged semantics and to increase the coverage of augmented features. RCDSU and FeatureDA perform style transfer and content transfer in the feature space, improving the generalization ability of the model at the style and content levels, respectively. On the Photo, Art Painting, Cartoon, and Sketch (PACS) multi-style classification task, RCDSU plus FeatureDA achieves competitive accuracy. 
After adding Gaussian noise to the PACS dataset, RCDSU plus FeatureDA shows strong robustness against outliers. FeatureDA also achieves excellent results on the CIFAR-100 image classification task. RCDSU plus FeatureDA can serve as a novel brain-inspired semantic data augmentation method with implicit robot automation, suitable for datasets with large style differences between training and testing data.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"18 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140300167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
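The "robust statistics" idea behind RCDSU can be illustrated by swapping mean/std for median/MAD, which outliers barely move; the 1.4826 factor rescales MAD so it is comparable to a standard deviation under Gaussian noise. The function name and details are illustrative, not the paper's exact computation:

```python
import statistics


def robust_feature_stats(features):
    """Median and MAD-based scale as outlier-resistant substitutes for
    the per-channel mean/std that vanilla DSU perturbs."""
    med = statistics.median(features)
    mad = statistics.median(abs(x - med) for x in features)
    # 1.4826 * MAD estimates the std of a Gaussian, so the robust scale
    # can drop in wherever a standard deviation was used.
    return med, 1.4826 * mad
```

With `[1, 2, 3, 4, 100]`, the median stays at 3 and the MAD-based scale stays small, whereas the mean and std would be dragged far off by the single outlier; that stability is what keeps the sampled statistic shifts semantically meaningful.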
{"title":"Multi-channel high-order network representation learning research","authors":"Zhonglin Ye, Yanlong Tang, Haixing Zhao, Zhaoyang Wang, Ying Ji","doi":"10.3389/fnbot.2024.1340462","DOIUrl":"https://doi.org/10.3389/fnbot.2024.1340462","url":null,"abstract":"Existing network representation learning algorithms mainly model the relationships between network nodes based on the structural features of the network, or use text features, hierarchical features, and other external attributes to realize joint network representation learning. Capturing the global features of the network allows the learned node vectors to retain more comprehensive feature information during training, thereby enhancing the quality of the embeddings. To preserve the global structural features of the network in the training results, we employed a multi-channel learning approach to perform high-order feature modeling on the network. We propose a novel algorithm for multi-channel high-order network representation learning, referred to as the Multi-Channel High-Order Network Representation (MHNR) algorithm. The algorithm first constructs high-order network features from the original network structure, thereby transforming the single-channel network representation learning process into a multi-channel high-order one. Then, for each single-channel learning process, a novel graph assimilation mechanism is introduced to realize high-order network structure modeling within single-channel representation learning. Finally, the algorithm jointly integrates the multi-channel and single-channel modeling of the high-order network structure, making efficient and sufficient use of network structural features. 
Experimental results show that the node classification performance of the proposed MHNR algorithm is strong on the Citeseer, Cora, and DBLP datasets, outperforming the comparison algorithms used in this paper. In addition, with an optimized vector length, the average node classification accuracy of the proposed algorithm is up to 12.24% higher than that of the DeepWalk algorithm. The proposed algorithm therefore reaches state-of-the-art node classification performance using only the structural features of the network, without supplementary modeling of external features.","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"52 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-02-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140006199","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
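One common way to "construct high-order network features from the original network structure" is to take boolean powers of the adjacency matrix, exposing k-step proximities as one channel per order. This sketch is an illustrative assumption about that step, not MHNR's exact construction:

```python
def high_order_channels(adj, max_order):
    """One channel per order k = 1..max_order: the boolean matrix power A^k,
    i.e., which node pairs are connected by a k-step walk. Each channel can
    then feed its own single-channel representation learner."""
    n = len(adj)
    channels = [adj]          # channel 1: the original first-order structure
    cur = adj
    for _ in range(max_order - 1):
        # Boolean matrix product: reachable in one more step.
        nxt = [[any(cur[i][k] and adj[k][j] for k in range(n)) for j in range(n)]
               for i in range(n)]
        channels.append(nxt)
        cur = nxt
    return channels
```

On a path graph 0-1-2, the second-order channel links nodes 0 and 2 even though no edge joins them, which is exactly the global structural signal a first-order-only method never sees.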
{"title":"Deep learning-based control framework for dynamic contact processes in humanoid grasping","authors":"Shaowen Cheng, Yongbin Jin, Hongtao Wang","doi":"10.3389/fnbot.2024.1349752","DOIUrl":"https://doi.org/10.3389/fnbot.2024.1349752","url":null,"abstract":"Humanoid grasping is a critical capability for anthropomorphic hands and plays a significant role in the development of humanoid robots. In this article, we present a deep learning-based control framework for humanoid grasping that incorporates the dynamic contact process among the anthropomorphic hand, the object, and the environment. This method efficiently eliminates the constraints imposed by inaccessible grasping points on both the contact surface of the object and the table surface. To mimic human-like grasping movements, an underactuated anthropomorphic hand, designed based on human hand data, is utilized. Using hand gestures, rather than controlling each motor separately, significantly reduces the control dimensionality. Additionally, a deep learning framework is used to select gestures and grasp actions. Our methodology, proven both in simulation and on a real robot, exceeds the performance of static analysis-based methods, as measured by the standard grasp metric <jats:italic>Q</jats:italic><jats:sub>1</jats:sub>. 
It expands the range of objects the system can handle, effectively grasping thin items such as cards on tables, a task beyond the capabilities of previous methodologies.","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"6 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140006200","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HiDeS: a higher-order-derivative-supervised neural ordinary differential equation for multi-robot systems and opinion dynamics","authors":"Meng Li, Wenyu Bian, Liangxiong Chen, Mei Liu","doi":"10.3389/fnbot.2024.1382305","DOIUrl":"https://doi.org/10.3389/fnbot.2024.1382305","url":null,"abstract":"<p>This paper addresses the limitations of current neural ordinary differential equations (NODEs) in modeling and predicting complex dynamics by introducing a novel framework called higher-order-derivative-supervised (HiDeS) NODE. This method extends traditional NODE frameworks by incorporating higher-order derivatives and their interactions into the modeling process, thereby enabling the capture of intricate system behaviors. In addition, the HiDeS NODE employs both the state vector and its higher-order derivatives as supervised signals, which is different from conventional NODEs that utilize only the state vector as a supervised signal. This approach is designed to enhance the predicting capability of NODEs. Through extensive experiments in the complex fields of multi-robot systems and opinion dynamics, the HiDeS NODE demonstrates improved modeling and predicting capabilities over existing models. This research not only proposes an expressive and predictive framework for dynamic systems but also marks the first application of NODEs to the fields of multi-robot systems and opinion dynamics, suggesting broad potential for future interdisciplinary work. 
The code is available at <ext-link ext-link-type=\"uri\" xlink:href=\"https://github.com/MengLi-Thea/HiDeS-A-Higher-Order-Derivative-Supervised-Neural-Ordinary-Differential-Equation\" xmlns:xlink=\"http://www.w3.org/1999/xlink\">https://github.com/MengLi-Thea/HiDeS-A-Higher-Order-Derivative-Supervised-Neural-Ordinary-Differential-Equation</ext-link>.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"93 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-02-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140108081","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
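The core HiDeS idea of supervising both the state and its higher-order derivatives can be sketched as a loss with two terms: a state-prediction error and a derivative error against the reference trajectory's derivative. The function name, the Euler step, the first-order-only supervision, and the finite-difference reference derivative are simplifying assumptions for illustration, not the paper's implementation:

```python
def hides_loss(f, xs, dt, alpha=0.5):
    """Sketch of a HiDeS-style objective for a scalar NODE dx/dt = f(x):
    penalize (1) the state error after one Euler step and (2) the error of
    the model's vector field against the trajectory's finite-difference
    derivative. A vanilla NODE loss would keep only the first term."""
    state_err, deriv_err = 0.0, 0.0
    for t in range(len(xs) - 1):
        d_ref = (xs[t + 1] - xs[t]) / dt      # reference derivative signal
        d_pred = f(xs[t])                      # model's predicted derivative
        state_err += (xs[t] + dt * d_pred - xs[t + 1]) ** 2
        deriv_err += (d_pred - d_ref) ** 2     # the extra HiDeS-style supervision
    return state_err + alpha * deriv_err
```

The extra derivative term gives the training signal information a state-only loss discards, which is the mechanism the abstract credits for the improved predictive capability.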
{"title":"A data-driven acceleration-level scheme for image-based visual servoing of manipulators with unknown structure","authors":"Liuyi Wen, Zhengtai Xie","doi":"10.3389/fnbot.2024.1380430","DOIUrl":"https://doi.org/10.3389/fnbot.2024.1380430","url":null,"abstract":"<p>Research on acceleration-level visual servoing of manipulators is crucial yet insufficient, which restricts the potential application range of visual servoing. To address this issue, this paper proposes a quadratic programming-based acceleration-level image-based visual servoing (AIVS) scheme that considers joint constraints. In addition, to handle the unknown structure of visual servoing systems, a data-driven learning algorithm is proposed to facilitate estimating structural information. Building upon this foundation, a data-driven acceleration-level image-based visual servoing (DAIVS) scheme is proposed, integrating learning and control capabilities. A recurrent neural network (RNN) is then developed to solve the DAIVS scheme, followed by theoretical analyses substantiating its stability. Simulations and experiments on a Franka Emika Panda manipulator with an eye-in-hand configuration, together with comparisons to existing methods, are provided. The obtained results demonstrate the feasibility and practicality of the proposed schemes and highlight the superior learning and control ability of the proposed RNN. 
This method is particularly well-suited for visual servoing applications of manipulators with unknown structure.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"122 1","pages":""},"PeriodicalIF":3.1,"publicationDate":"2024-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140170748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
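The "data-driven estimation of structural information" in such schemes amounts to fitting the unknown image Jacobian from recorded joint and image-feature velocities. A scalar least-squares sketch of that fit, an illustrative simplification of the matrix case rather than the paper's algorithm:

```python
def estimate_interaction_gain(q_dots, s_dots):
    """Least-squares estimate of the scalar J in s_dot = J * q_dot from
    recorded samples; the flavor of structural estimation a DAIVS-style
    scheme would perform online (scalar case for brevity)."""
    num = sum(q * s for q, s in zip(q_dots, s_dots))  # sum q_i * s_i
    den = sum(q * q for q in q_dots)                  # sum q_i^2
    return num / den
```

In the full scheme this estimate would be refreshed from a sliding window of measurements and fed to the QP controller, so no analytic camera-robot model is ever required.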