{"title":"Multiagent Task Allocation for Dynamic Intelligent Space: Auction and Preemption With Ontology Knowledge Graph","authors":"Wei Li, Jianhang Shang, Guoliang Liu, Zhenhua Liu, Guohui Tian","doi":"10.1049/csy2.70013","DOIUrl":"https://doi.org/10.1049/csy2.70013","url":null,"abstract":"<p>This paper introduces a pioneering dynamic system optimisation for multiagent (DySOMA) framework, revolutionising task scheduling in dynamic intelligent spaces with an emphasis on multirobot systems. The core of DySOMA is an advanced auction-based algorithm coupled with a novel task preemption ranking mechanism, seamlessly integrated with an ontology knowledge graph that dynamically updates. This integration not only enhances the efficiency of task allocation among robots but also significantly improves the adaptability of the system to environmental changes. Compared to other advanced algorithms, the DySOMA algorithm shows significant performance improvements, with its RLB 26.8% higher than that of the best-performing Consensus-Based Parallel Auction and Execution (CBPAE) algorithm at 10 robots and 29.7% higher at 20 robots, demonstrating its superior capability in balancing task loads and optimising task completion times in larger, more complex environments. DySOMA sets a new benchmark for intelligent robot task scheduling, promising significant advancements in the autonomy and flexibility of robotic systems in complex evolving environments.</p>","PeriodicalId":34110,"journal":{"name":"IET Cybersystems and Robotics","volume":"7 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2025-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.70013","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143836152","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
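The DySOMA abstract above describes auction-based allocation with a task preemption ranking but does not spell out the bidding or preemption rules. The following Python sketch is only an illustrative single-round auction with priority-based preemption: the travel-distance bid, the priority field and all names are assumptions made for illustration, not the paper's algorithm.

```python
# Minimal illustrative sketch of auction-style task allocation with preemption.
# NOT the DySOMA algorithm: bidding by travel distance and preemption by task
# priority are simplifying assumptions made for illustration only.
from dataclasses import dataclass

@dataclass
class Robot:
    name: str
    position: tuple                      # (x, y) in metres
    current_task: "Task | None" = None

@dataclass
class Task:
    name: str
    location: tuple
    priority: int                        # higher value = more urgent (assumed convention)

def travel_cost(robot: Robot, task: Task) -> float:
    """Euclidean distance used as a stand-in bid cost."""
    (rx, ry), (tx, ty) = robot.position, task.location
    return ((rx - tx) ** 2 + (ry - ty) ** 2) ** 0.5

def auction(robots: list, tasks: list) -> dict:
    """Assign each task to the cheapest robot; a busy robot is preempted
    only if the new task has strictly higher priority."""
    assignment = {}
    for task in sorted(tasks, key=lambda t: -t.priority):
        bids = []
        for r in robots:
            if r.current_task is not None and r.current_task.priority >= task.priority:
                continue                 # cannot preempt an equal/higher-priority task
            bids.append((travel_cost(r, task), r))
        if not bids:
            continue                     # no robot available for this task
        _, winner = min(bids, key=lambda b: b[0])
        winner.current_task = task
        assignment[task.name] = winner.name
    return assignment

if __name__ == "__main__":
    robots = [Robot("r1", (0, 0)), Robot("r2", (5, 5))]
    tasks = [Task("deliver_cup", (1, 1), priority=2), Task("open_door", (4, 4), priority=1)]
    print(auction(robots, tasks))        # {'deliver_cup': 'r1', 'open_door': 'r2'}
```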
{"title":"Autonomous Navigation and Collision Avoidance for AGV in Dynamic Environments: An Enhanced Deep Reinforcement Learning Approach With Composite Rewards and Dynamic Update Mechanisms","authors":"Zijianglong Huang, Zhigang Ren, Tehuan Chen, Shengze Cai, Chao Xu","doi":"10.1049/csy2.70012","DOIUrl":"https://doi.org/10.1049/csy2.70012","url":null,"abstract":"<p>With the rapid development of the logistics, manufacturing and warehousing fields, the autonomous navigation and intelligent obstacle avoidance technology of automated guided vehicles (AGVs) has become a focus of scientific research. In this paper, an enhanced deep reinforcement learning (DRL) framework is proposed, aiming to empower AGVs with autonomous navigation and obstacle avoidance capabilities in unknown and changing complex environments. To address the problems of time-consuming training and limited generalisation ability of traditional DRL, we refine the twin delayed deep deterministic policy gradient algorithm by integrating adaptive noise attenuation and dynamic delayed updating, optimising both training efficiency and model robustness. To further strengthen the AGV's ability to perceive and respond to changes in a dynamic environment, we introduce a distance-based obstacle penalty term in the designed composite reward function, which ensures that the AGV can predict and avoid obstacles effectively in dynamic scenarios. Experiments indicate that the AGV model trained by this algorithm exhibits excellent autonomous navigation capability in both static and dynamic environments, achieving a high task completion rate and stable, reliable operation, which demonstrates the efficiency, robustness and practical value of the method.</p>","PeriodicalId":34110,"journal":{"name":"IET Cybersystems and Robotics","volume":"7 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.70012","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143822314","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
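The abstract above mentions a composite reward with a distance-based obstacle penalty term. The sketch below illustrates what such a reward might look like; the weights, thresholds and the specific progress, arrival and collision terms are assumptions, not the paper's actual reward function.

```python
# Illustrative composite reward for an AGV navigation policy.
# The weights, radii and the exact form of each term are assumptions for
# illustration; the paper's actual reward function may differ.
def composite_reward(prev_dist_to_goal: float,
                     dist_to_goal: float,
                     dist_to_nearest_obstacle: float,
                     goal_radius: float = 0.3,
                     safe_radius: float = 1.0,
                     collision_radius: float = 0.2) -> float:
    reward = 0.0

    # Progress term: positive when the AGV moves closer to the goal.
    reward += 2.0 * (prev_dist_to_goal - dist_to_goal)

    # Terminal bonus / penalty.
    if dist_to_goal < goal_radius:
        reward += 10.0                    # reached the goal
    if dist_to_nearest_obstacle < collision_radius:
        reward -= 10.0                    # collision

    # Distance-based obstacle penalty: zero outside the safe radius,
    # growing as the clearance shrinks.
    if dist_to_nearest_obstacle < safe_radius:
        reward -= (safe_radius - dist_to_nearest_obstacle) / safe_radius

    # Small per-step cost to discourage dawdling.
    reward -= 0.01
    return reward

# Example: the AGV advanced 0.2 m towards the goal but is only 0.5 m from an obstacle.
print(composite_reward(prev_dist_to_goal=4.2, dist_to_goal=4.0,
                       dist_to_nearest_obstacle=0.5))
```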
{"title":"Hybrid Attention Spike Transformer","authors":"Xiongfei Fan, Hong Zhang, Yu Zhang","doi":"10.1049/csy2.70010","DOIUrl":"https://doi.org/10.1049/csy2.70010","url":null,"abstract":"<p>Spike transformers cannot be pretrained due to practical constraints such as the lack of suitable datasets and memory limitations, which results in a significant performance gap compared to pretrained artificial neural networks (ANNs), thereby hindering their practical applicability. To address this issue, we propose a hybrid attention spike transformer that utilises self-attention with compound tokens and channel attention-based token processing to better capture the inductive biases of the data. We also add convolutions to the patch-splitting and feedforward networks, which not only provides local information but also leverages the translation invariance and locality of convolutions to help the model converge. Experiments on static and neuromorphic datasets demonstrate that our method achieves state-of-the-art performance in the spiking neural networks (SNNs) field. Notably, we achieve a top-1 accuracy of 80.59% on CIFAR-100 with only 4 time steps. To the best of our knowledge, this is the first exploration of a spike transformer with multi-attention fusion, and it achieves outstanding effectiveness.</p>","PeriodicalId":34110,"journal":{"name":"IET Cybersystems and Robotics","volume":"7 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2025-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.70010","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143809409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
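The abstract above refers to channel attention-based token processing. As a rough illustration, the following sketch applies squeeze-and-excitation-style channel attention to token features in plain NumPy; it is real-valued and ignores the spiking dynamics and the self-attention branch, so it is not the paper's model.

```python
# Minimal real-valued sketch of channel attention over token features
# (squeeze-and-excitation style). It ignores the spiking dynamics of the
# paper's model and only shows how channel weights can rescale tokens.
import numpy as np

rng = np.random.default_rng(0)

def channel_attention(tokens: np.ndarray, w1: np.ndarray, w2: np.ndarray) -> np.ndarray:
    """tokens: (num_tokens, channels). Returns tokens rescaled per channel."""
    squeezed = tokens.mean(axis=0)                    # (channels,) global token pooling
    hidden = np.maximum(w1 @ squeezed, 0.0)           # ReLU bottleneck
    weights = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))    # sigmoid channel weights in (0, 1)
    return tokens * weights                           # broadcast over all tokens

num_tokens, channels, reduced = 16, 64, 16
tokens = rng.standard_normal((num_tokens, channels))
w1 = rng.standard_normal((reduced, channels)) * 0.1
w2 = rng.standard_normal((channels, reduced)) * 0.1
print(channel_attention(tokens, w1, w2).shape)        # (16, 64)
```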
{"title":"RMF-ED: Real-Time Multimodal Fusion for Enhanced Target Detection in Low-Light Environments","authors":"Yuhong Wu, Jinkai Cui, Kuoye Niu, Yanlong Lu, Lijun Cheng, Shengze Cai, Chao Xu","doi":"10.1049/csy2.70011","DOIUrl":"https://doi.org/10.1049/csy2.70011","url":null,"abstract":"<p>Accurate target detection in low-light environments is crucial for unmanned aerial vehicles (UAVs) and autonomous driving applications. In this study, the authors introduce a real-time multimodal fusion for enhanced detection (RMF-ED), a novel framework designed to overcome the limitations of low-light target detection. By leveraging the complementary capabilities of near-infrared (NIR) cameras and light detection and ranging (LiDAR) sensors, RMF-ED enhances detection performance. An advanced NIR generative adversarial network (NIR-GAN) model was developed to address the lack of annotated NIR datasets, integrating structural similarity index measure (SSIM) loss and L1 loss functions. This approach enables the generation of high-quality NIR images from RGB datasets, bridging a critical gap in training data. Furthermore, the multimodal fusion algorithm integrates RGB images, NIR images, and LiDAR point clouds, ensuring consistency and accuracy in proposal fusion. Experimental results on the KITTI dataset demonstrate that RMF-ED achieves performance comparable to or exceeding state-of-the-art fusion algorithms, with a computational time of only 21 ms. These features make RMF-ED an efficient and versatile solution for real-time applications in low-light environments.</p>","PeriodicalId":34110,"journal":{"name":"IET Cybersystems and Robotics","volume":"7 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2025-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.70011","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143793286","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
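The abstract above states that the NIR-GAN training objective combines SSIM and L1 losses. The sketch below shows one way such a combined reconstruction loss can be written; for brevity SSIM is computed from global image statistics rather than the usual sliding window, and the weighting factors are assumptions rather than the paper's values.

```python
# Sketch of a combined L1 + SSIM reconstruction objective for NIR image generation.
# Global-statistics SSIM and the lambda weights are simplifying assumptions.
import numpy as np

def global_ssim(x: np.ndarray, y: np.ndarray, data_range: float = 1.0) -> float:
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx**2 + my**2 + c1) * (vx + vy + c2))

def generator_reconstruction_loss(fake_nir: np.ndarray, real_nir: np.ndarray,
                                  lambda_l1: float = 100.0,
                                  lambda_ssim: float = 10.0) -> float:
    l1 = np.abs(fake_nir - real_nir).mean()
    ssim_term = 1.0 - global_ssim(fake_nir, real_nir)   # 0 when images are identical
    return lambda_l1 * l1 + lambda_ssim * ssim_term

rng = np.random.default_rng(1)
real = rng.random((64, 64))
fake = np.clip(real + 0.05 * rng.standard_normal((64, 64)), 0.0, 1.0)
print(generator_reconstruction_loss(fake, real))
```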
{"title":"Flocking Navigation and Obstacle Avoidance for UAV Swarms Via Adaptive Risk Avoidance Willingness Mechanism","authors":"Chao Li, Xiaojia Xiang, Yihao Sun, Chao Yan, Yixin Huang, Tianjiang Hu, Han Zhou","doi":"10.1049/csy2.70009","DOIUrl":"https://doi.org/10.1049/csy2.70009","url":null,"abstract":"<p>A swarm of unmanned aerial vehicles (UAVs) has been widely used in both military and civilian fields due to its advantages of high cost-effectiveness, high task efficiency and strong survivability. However, there are still challenges in flocking control of UAV swarms in complex environments with various obstacles. In this paper, we propose a flocking control and obstacle avoidance method for UAV swarms, called the willingness control method (WCM). Specifically, we propose an adaptive risk avoidance willingness (ARAW) mechanism, in which each UAV maintains an ARAW coefficient representing its willingness to avoid risk. As the UAV gets closer to a danger, its ARAW increases. On this basis, an obstacle avoidance method for UAV swarms is designed, and an informed individual mechanism influenced by neighbour repulsion is introduced. By combining this with the hierarchical weighting Vicsek model (HWVEM), the UAV swarm system can balance flocking navigation and obstacle avoidance simultaneously and adaptively adjust the priority of the two tasks during operation. Finally, under local communication constraints of the UAV, a series of simulation experiments as well as real-world experiments with up to 12 UAVs are conducted to verify the safety and compactness of the proposed method.</p>","PeriodicalId":34110,"journal":{"name":"IET Cybersystems and Robotics","volume":"7 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2025-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.70009","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143770495","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
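The abstract above describes an ARAW coefficient that grows as a UAV approaches danger. The following sketch assumes a simple linear ramp between a sensing radius and a danger radius and blends flocking and avoidance velocities by that coefficient; the paper's actual ARAW law and parameters are not given here, so treat every number as a placeholder.

```python
# Illustrative adaptive risk-avoidance-willingness coefficient: willingness rises
# towards 1 as the UAV approaches a danger, and the commanded velocity blends
# flocking and avoidance terms by that coefficient. The linear ramp and the
# radii are assumptions, not the paper's ARAW law.
import numpy as np

def araw_coefficient(dist_to_danger: float,
                     danger_radius: float = 1.0,
                     sense_radius: float = 5.0) -> float:
    """Willingness in [0, 1]: 0 when far from danger, 1 at the danger radius."""
    if dist_to_danger >= sense_radius:
        return 0.0
    if dist_to_danger <= danger_radius:
        return 1.0
    return (sense_radius - dist_to_danger) / (sense_radius - danger_radius)

def commanded_velocity(v_flock: np.ndarray, v_avoid: np.ndarray,
                       dist_to_danger: float) -> np.ndarray:
    w = araw_coefficient(dist_to_danger)
    return (1.0 - w) * v_flock + w * v_avoid

v_flock = np.array([1.0, 0.0])     # keep formation and heading
v_avoid = np.array([0.0, 1.0])     # steer away from the obstacle
print(commanded_velocity(v_flock, v_avoid, dist_to_danger=2.0))   # mostly avoidance
```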
{"title":"Adaptive Safe Braking and Distance Prediction for Overhead Cranes With Multivariation Using MLP","authors":"Tenglong Zhang, Guoliang Liu, Huili Chen, Guohui Tian, Qingqiang Guo","doi":"10.1049/csy2.70007","DOIUrl":"https://doi.org/10.1049/csy2.70007","url":null,"abstract":"<p>The emergency braking and braking distance prediction of an overhead crane pose challenging issues in its safe operation. This paper employs a multilayer perceptron (MLP) to implement adaptive safe distance prediction for an overhead crane with multiple variations. First, a discrete model of an overhead crane is constructed, and a model predictive control (MPC) model with angle constraints is applied for safe braking. Second, we analysed and selected the input variations of the safe distance prediction model. Subsequently, we permuted the inputs to the MLP and analysed the effect of each input separately on the accuracy of the MLP in predicting safe distances. We constructed a training dataset and a test dataset, and optimised the safe distance prediction model on the training dataset. Finally, we conducted a comparative analysis between the MLP and nlinfit algorithms, highlighting the superiority of MLP-based adaptive safe distance prediction for overhead cranes. Experiments confirm the method's ability to ensure a minimal swing angle during the entire braking process and achieve safe braking. The results underscore the practical utility and novelty of the proposed algorithm.</p>","PeriodicalId":34110,"journal":{"name":"IET Cybersystems and Robotics","volume":"7 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.70007","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143749682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
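The abstract above trains an MLP to predict a safe braking distance from several varying inputs. The sketch below shows a generic MLP regression loop on synthetic data; the chosen input features (trolley speed, load mass, rope length) and the data-generating formula are purely illustrative assumptions, not the paper's dataset or input variations.

```python
# Sketch of MLP-based braking-distance regression on synthetic data.
# The feature set and the physics-flavoured target formula are assumptions
# made only to show the training/prediction loop.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 2000
speed = rng.uniform(0.1, 2.0, n)        # trolley speed, m/s (assumed feature)
mass = rng.uniform(100, 5000, n)        # load mass, kg (assumed feature)
rope = rng.uniform(1.0, 10.0, n)        # rope length, m (assumed feature)

# Synthetic target: distance grows with speed^2 and load, plus noise (illustrative only).
distance = 0.8 * speed**2 + 1e-4 * mass * speed + 0.02 * rope + rng.normal(0, 0.02, n)

X = np.column_stack([speed, mass, rope])
X_train, X_test, y_train, y_test = train_test_split(X, distance, test_size=0.2, random_state=0)

model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0),
)
model.fit(X_train, y_train)
print("R^2 on held-out data:", round(model.score(X_test, y_test), 3))
print("Predicted braking distance (m):", model.predict([[1.5, 2500.0, 4.0]])[0])
```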
{"title":"Move to See More: Approaching Object With Partial Occlusion Using Large Multimodal Model and Active Object Detection","authors":"Aoqi Wang, Guohui Tian, Yuhao Wang, Zhongyang Li","doi":"10.1049/csy2.70008","DOIUrl":"https://doi.org/10.1049/csy2.70008","url":null,"abstract":"<p>Active object detection (AOD) is a crucial task in the field of robotics. A key challenge in household environments for AOD is that the target object is often undetectable due to partial occlusion, which leads to the failure of traditional methods. To address the occlusion problem, this paper first proposes a novel occlusion handling method based on the large multimodal model (LMM). The method utilises an LMM to detect and analyse input RGB images and generates adjustment actions to progressively eliminate occlusion. After the occlusion is handled, an improved AOD method based on a deep Q-learning network (DQN) is used to complete the task. We introduce an attention mechanism to process image features, enabling the model to focus on critical regions of the input images. Additionally, a new reward function is proposed that comprehensively considers the bounding box of the target object and the robot's distance to the object, along with the actions performed by the robot. Experiments on the dataset and in real-world scenarios validate the effectiveness of the proposed method in performing AOD tasks under partial occlusion.</p>","PeriodicalId":34110,"journal":{"name":"IET Cybersystems and Robotics","volume":"7 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2025-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.70008","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143707626","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
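The abstract above describes a reward that considers the target's bounding box, the robot-object distance and the executed actions. The sketch below is one hedged guess at the structure of such a reward; the weights, success thresholds and terminal bonus are assumptions rather than the paper's definition.

```python
# Illustrative reward for active object detection: it rewards growth of the
# target's bounding box (the robot "sees more"), rewards getting closer to the
# object, and charges a small cost per action. All weights and thresholds are
# assumptions; the paper defines its own reward function.
def aod_reward(prev_bbox_area: float, bbox_area: float,
               prev_distance: float, distance: float,
               step_cost: float = 0.05,
               success_area: float = 0.25,      # bbox covers >= 25% of the image
               success_distance: float = 0.8):  # metres
    reward = 0.0
    reward += 5.0 * (bbox_area - prev_bbox_area)      # target becomes more visible
    reward += 1.0 * (prev_distance - distance)        # robot moves closer
    reward -= step_cost                               # every action has a cost
    done = bbox_area >= success_area and distance <= success_distance
    if done:
        reward += 10.0                                # task completed
    return reward, done

r, done = aod_reward(prev_bbox_area=0.08, bbox_area=0.12,
                     prev_distance=2.0, distance=1.7)
print(r, done)   # small positive reward, not yet done
```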
{"title":"Bioinspired framework for real-time collision detection with dynamic obstacles in cluttered outdoor environments using event cameras","authors":"Meriem Ben Miled, Wenwen Liu, Yuanchang Liu","doi":"10.1049/csy2.70006","DOIUrl":"https://doi.org/10.1049/csy2.70006","url":null,"abstract":"<p>In the field of robotics and visual-based navigation, event cameras are gaining popularity due to their exceptional dynamic range, low power consumption, and rapid response capabilities. These neuromorphic devices facilitate the efficient detection and avoidance of fast-moving obstacles and address common limitations of traditional hardware. However, the majority of state-of-the-art event-based algorithms still rely on conventional computer vision strategies. The goal is to shift away from standard protocols for dynamic obstacle detection by taking inspiration from the time-computational paradigm of the biological vision system. In this paper, the authors present an innovative framework inspired by a biological response mechanism triggered by approaching objects, enabling the perception and identification of potential collision threats. The method, validated through both simulation and real-world experimentation, charts a new path in the application of event cameras for dynamic obstacle detection and avoidance in autonomous unmanned aerial vehicles. Compared to conventional methods, the proposed approach achieves a success rate of 97% in detecting obstacles in real-world outdoor settings.</p>","PeriodicalId":34110,"journal":{"name":"IET Cybersystems and Robotics","volume":"7 1","pages":""},"PeriodicalIF":1.5,"publicationDate":"2025-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.70006","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143629855","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Novel vision-LiDAR fusion framework for human action recognition based on dynamic lateral connection","authors":"Fei Yan, Guangyao Jin, Zheng Mu, Shouxing Zhang, Yinghao Cai, Tao Lu, Yan Zhuang","doi":"10.1049/csy2.70005","DOIUrl":"https://doi.org/10.1049/csy2.70005","url":null,"abstract":"<p>In the past decades, substantial progress has been made in human action recognition. However, most existing studies and datasets for human action recognition utilise still images or videos as the primary modality. Image-based approaches can be easily impacted by adverse environmental conditions. In this paper, the authors propose combining RGB images and point clouds from LiDAR sensors for human action recognition. A dynamic lateral convolutional network (DLCN) is proposed to fuse features from multi-modalities. The RGB features and the geometric information from the point clouds closely interact with each other in the DLCN, which is complementary in action recognition. The experimental results on the JRDB-Act dataset demonstrate that the proposed DLCN outperforms the state-of-the-art approaches of human action recognition. The authors show the potential of the proposed DLCN in various complex scenarios, which is highly valuable in real-world applications.</p>","PeriodicalId":34110,"journal":{"name":"IET Cybersystems and Robotics","volume":"6 4","pages":""},"PeriodicalIF":1.5,"publicationDate":"2024-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.70005","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143121447","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Big2Small: Learning from masked image modelling with heterogeneous self-supervised knowledge distillation","authors":"Ziming Wang, Shumin Han, Xiaodi Wang, Jing Hao, Xianbin Cao, Baochang Zhang","doi":"10.1049/csy2.70002","DOIUrl":"https://doi.org/10.1049/csy2.70002","url":null,"abstract":"<p>Small convolutional neural network (CNN)-based models usually require transferring knowledge from a large model before they are deployed in computationally resource-limited edge devices. Masked image modelling (MIM) methods achieve great success in various visual tasks but remain largely unexplored in knowledge distillation for heterogeneous deep models. This is mainly due to the significant discrepancy between large transformer-based models and small CNN-based networks. In this paper, the authors develop the first heterogeneous self-supervised knowledge distillation (HSKD) method based on MIM, which can efficiently transfer knowledge from large transformer models to small CNN-based models in a self-supervised fashion. Our method builds a bridge between transformer-based models and CNNs by training a UNet-style student with sparse convolution, which can effectively mimic the visual representation inferred by a teacher over masked modelling. Our method is a simple yet effective paradigm for learning the visual representation and data distribution from heterogeneous teacher models, which can be pre-trained using advanced self-supervised methods. Extensive experiments show that it adapts well to various models and sizes, consistently achieving state-of-the-art performance in image classification, object detection, and semantic segmentation tasks. For example, on the ImageNet-1K dataset, HSKD improves the accuracy of ResNet-50 (sparse) from 76.98% to 80.01%.</p>","PeriodicalId":34110,"journal":{"name":"IET Cybersystems and Robotics","volume":"6 4","pages":""},"PeriodicalIF":1.5,"publicationDate":"2024-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1049/csy2.70002","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143121450","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
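The abstract above distils a transformer teacher into a CNN student over masked image modelling. The sketch below shows the general idea of a masked feature-matching loss, computed only at masked patch positions; the shapes, mask ratio and plain MSE matching are assumptions, and HSKD's actual student is a sparse-convolution UNet trained against a real teacher rather than the random features used here.

```python
# Sketch of a masked-modelling distillation objective: the student only has to
# match the teacher's patch features at masked positions.
import numpy as np

rng = np.random.default_rng(0)
num_patches, feat_dim, mask_ratio = 196, 768, 0.6       # assumed shapes and ratio

teacher_feats = rng.standard_normal((num_patches, feat_dim))                    # frozen teacher output
student_feats = teacher_feats + 0.1 * rng.standard_normal((num_patches, feat_dim))  # stand-in student output

# Randomly mask a subset of patches; the loss is computed only on masked ones.
num_masked = int(mask_ratio * num_patches)
masked_idx = rng.choice(num_patches, size=num_masked, replace=False)

def masked_distillation_loss(student: np.ndarray, teacher: np.ndarray,
                             idx: np.ndarray) -> float:
    diff = student[idx] - teacher[idx]
    return float((diff ** 2).mean())

print(masked_distillation_loss(student_feats, teacher_feats, masked_idx))
```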