{"title":"MLNet: a multi-level multimodal named entity recognition architecture.","authors":"Hanming Zhai, Xiaojun Lv, Zhiwen Hou, Xin Tong, Fanliang Bu","doi":"10.3389/fnbot.2023.1181143","DOIUrl":"https://doi.org/10.3389/fnbot.2023.1181143","url":null,"abstract":"<p><p>In the field of human-computer interaction, accurately identifying the talking object can help robots accomplish subsequent tasks such as decision-making or recommendation; object determination is therefore of great interest as a prerequisite task. Whether in named entity recognition (NER) in natural language processing (NLP) or object detection (OD) in computer vision (CV), the essence is object recognition. Currently, multimodal approaches are widely used in basic image recognition and natural language processing tasks. Such multimodal architectures can perform entity recognition more accurately, but when faced with short texts and images containing substantial noise, there is still room for optimization in image-text multimodal named entity recognition (MNER) architectures. In this study, we propose a new multi-level multimodal named entity recognition architecture: a network that extracts useful visual information to boost semantic understanding and thereby improve entity identification. Specifically, we first encoded the image and text separately and then built a symmetric Transformer-based neural network architecture for multimodal feature fusion. We utilized a gating mechanism to filter visual information significantly related to the textual content, in order to enhance text understanding and achieve semantic disambiguation. Furthermore, we incorporated character-level vector encoding to reduce text noise. Finally, we employed Conditional Random Fields for the label classification task. Experiments on the Twitter dataset show that our model increases the accuracy of the MNER task.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"17 ","pages":"1181143"},"PeriodicalIF":3.1,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10319056/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10180393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
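The gating step described in the MLNet abstract — suppressing visual features that are not relevant to the text before fusing the two modalities — can be sketched roughly as follows. The function name, weight shapes, and additive fusion form are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(text_feat, vis_feat, W_t, W_v):
    """Fuse a visual feature into a text feature through a learned gate.

    The gate scores how relevant the visual vector is to the text vector,
    so weakly related visual information is suppressed before fusion.
    (Illustrative form; not the paper's exact equations.)
    """
    # Gate in (0, 1), computed from both modalities.
    gate = sigmoid(text_feat @ W_t + vis_feat @ W_v)
    # Only the gated portion of the visual feature reaches the text stream.
    return text_feat + gate * vis_feat

# Toy vectors and small random weights stand in for learned parameters.
rng = np.random.default_rng(0)
d = 8
text = rng.normal(size=d)
vis = rng.normal(size=d)
W_t = rng.normal(size=(d, d)) * 0.1
W_v = rng.normal(size=(d, d)) * 0.1
fused = gated_fusion(text, vis, W_t, W_v)
print(fused.shape)  # (8,)
```

Because the gate is bounded in (0, 1), the fused vector never moves farther from the text feature than the full visual feature would push it.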
{"title":"E-YOLOv4-tiny: a traffic sign detection algorithm for urban road scenarios.","authors":"Yanqiu Xiao, Shiao Yin, Guangzhen Cui, Weili Zhang, Lei Yao, Zhanpeng Fang","doi":"10.3389/fnbot.2023.1220443","DOIUrl":"https://doi.org/10.3389/fnbot.2023.1220443","url":null,"abstract":"<p><strong>Introduction: </strong>In urban road scenes, due to the small size of traffic signs and the large amount of surrounding interference information, current methods struggle to achieve good detection results in the field of unmanned driving.</p><p><strong>Methods: </strong>To address these challenges, this paper proposes an improved E-YOLOv4-tiny based on YOLOv4-tiny. Firstly, this article constructs an efficient layer aggregation lightweight block with depthwise separable convolutions to enhance the feature extraction ability of the backbone. Secondly, this paper presents a feature fusion refinement module aimed at fully integrating multi-scale features. Moreover, this module incorporates our proposed efficient coordinate attention for refining interference information during feature transfer. Finally, this article proposes an improved S-RFB to add contextual feature information to the network, further enhancing the accuracy of traffic sign detection.</p><p><strong>Results and discussion: </strong>The method in this paper is tested on the CCTSDB dataset and the Tsinghua-Tencent 100K dataset. The experimental results show that the proposed method outperforms the original YOLOv4-tiny in traffic sign detection, with mAP improvements of 3.76% and 7.37%, respectively, and a 21% reduction in the number of parameters. Compared with other advanced methods, the method proposed in this paper achieves a better balance between accuracy, real-time performance, and the number of model parameters, giving it greater practical value.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"17 ","pages":"1220443"},"PeriodicalIF":3.1,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10391168/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10290507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
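The parameter saving that motivates the depthwise separable convolutions in the E-YOLOv4-tiny backbone can be checked with a quick count. The layer sizes below are arbitrary examples, not values from the paper, and bias terms are omitted for simplicity.

```python
def conv_params(c_in, c_out, k):
    """Parameters of a standard k x k convolution (no bias)."""
    return c_in * c_out * k * k

def dw_separable_params(c_in, c_out, k):
    """Depthwise k x k convolution followed by a 1 x 1 pointwise
    convolution (no bias) -- the depthwise separable factorization."""
    return c_in * k * k + c_in * c_out

# Example: a 3x3 layer mapping 128 channels to 128 channels.
std = conv_params(128, 128, 3)          # 147456
sep = dw_separable_params(128, 128, 3)  # 1152 + 16384 = 17536
print(std, sep, round(std / sep, 1))    # 147456 17536 8.4
```

For this layer size the factorization uses roughly 8.4x fewer parameters, which is the kind of saving that enables the 21% overall parameter reduction reported above.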
{"title":"Tracking by segmentation with future motion estimation applied to person-following robots.","authors":"Shenlu Jiang, Runze Cui, Runze Wei, Zhiyang Fu, Zhonghua Hong, Guofu Feng","doi":"10.3389/fnbot.2023.1255085","DOIUrl":"https://doi.org/10.3389/fnbot.2023.1255085","url":null,"abstract":"<p><p>Person-following is a crucial capability for service robots, and the employment of vision technology is a leading trend in building environmental understanding. While most existing methodologies rely on a tracking-by-detection strategy, which necessitates extensive datasets for training and yet remains susceptible to environmental noise, we propose a novel approach: real-time tracking-by-segmentation with a future motion estimation framework. This framework facilitates pixel-level tracking of a target individual and predicts their future motion. Our strategy leverages a single-shot segmentation tracking neural network for precise foreground segmentation to track the target, overcoming the limitations of using a rectangular region of interest (ROI). Here we clarify that, while the ROI provides a broad context, the segmentation within this bounding box offers a detailed and more accurate position of the human subject. To further improve our approach, a classification-lock pre-trained layer is utilized to form a constraint that curbs feature outliers originating from the person being tracked. A discriminative correlation filter estimates the potential target region in the scene to prevent foreground misrecognition, while a motion estimation neural network anticipates the target's future motion for use in the control module. We validated our proposed methodology using the VOT, LaSOT, YouTube-VOS, and DAVIS tracking datasets, demonstrating its effectiveness. Notably, our framework supports long-term person-following tasks in indoor environments, showing promise for practical implementation in service robots.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"17 ","pages":"1255085"},"PeriodicalIF":3.1,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10494445/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10295017","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adversarial robustness in deep neural networks based on variable attributes of the stochastic ensemble model.","authors":"Ruoxi Qin, Linyuan Wang, Xuehui Du, Pengfei Xie, Xingyuan Chen, Bin Yan","doi":"10.3389/fnbot.2023.1205370","DOIUrl":"https://doi.org/10.3389/fnbot.2023.1205370","url":null,"abstract":"<p><p>Deep neural networks (DNNs) have been shown to be susceptible to critical vulnerabilities when attacked by adversarial samples. This has prompted the development of attack and defense strategies similar to those used in cyberspace security. The dependence of such strategies on attack and defense mechanisms makes the associated algorithms on both sides appear as closely coupled processes, with the defense method being particularly passive in these processes. Inspired by the dynamic defense approach proposed in cyberspace security to address endless arms races, this article defines ensemble quantity, network structure, and smoothing parameters as variable ensemble attributes and proposes a stochastic ensemble strategy based on heterogeneous and redundant sub-models. The proposed method introduces diversity and randomness into the deep neural network to alter the fixed gradient correspondence between input and output. The unpredictability and diversity of the gradients make it more difficult for attackers to directly mount white-box attacks, helping to address the extreme transferability and vulnerability of ensemble models under white-box attacks. Experimental comparison of <i>ASR-vs.-distortion curves</i> under different attack scenarios on CIFAR10 preliminarily demonstrates the effectiveness of the proposed method: even the highest-capacity attacker cannot easily raise the attack success rate against the ensemble smoothed model, especially for untargeted attacks.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"17 ","pages":"1205370"},"PeriodicalIF":3.1,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10442534/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10442621","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
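The variable-attribute idea in the stochastic ensemble abstract — drawing a fresh random subset of heterogeneous sub-models per query so the input-output gradient an attacker observes is not fixed — can be sketched as follows. The class, the toy scalar "models," and the averaging rule are illustrative assumptions, not the paper's architecture.

```python
import random

class StochasticEnsemble:
    """Serve each prediction from a randomly drawn subset of heterogeneous
    sub-models, so the effective model (and hence its gradient) varies
    between attacker queries. Illustrative sketch only."""

    def __init__(self, sub_models, subset_size):
        self.sub_models = sub_models
        self.subset_size = subset_size

    def predict(self, x):
        # A fresh random subset per query: repeated queries with the same
        # input need not go through the same computation graph.
        chosen = random.sample(self.sub_models, self.subset_size)
        scores = [m(x) for m in chosen]
        return sum(scores) / len(scores)

# Toy sub-models: each a different scalar "network" standing in for a
# heterogeneous, redundant member of the ensemble.
models = [lambda x, a=a: a * x for a in (0.8, 1.0, 1.2, 1.4)]
ens = StochasticEnsemble(models, subset_size=2)
print(ens.predict(1.0))
```

Any single prediction stays within the range spanned by the sub-models, but which pair produced it is unpredictable, which is the property the defense relies on.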
{"title":"Multi-robot cooperative autonomous exploration via task allocation in terrestrial environments.","authors":"Xiangda Yan, Zhe Zeng, Keyan He, Huajie Hong","doi":"10.3389/fnbot.2023.1179033","DOIUrl":"https://doi.org/10.3389/fnbot.2023.1179033","url":null,"abstract":"<p><p>Cooperative autonomous exploration is a challenging task for multi-robot systems, which can cover larger areas in less time or with shorter path lengths. Using multiple mobile robots to cooperatively explore unknown environments can be more efficient than using a single robot, but multi-robot cooperative autonomous exploration also presents many difficulties. The key to success is effective coordination between the robots. This paper designs a multi-robot cooperative autonomous exploration strategy based on task allocation. Additionally, considering that mobile robots are inevitably subject to failure in harsh conditions, we propose a self-healing cooperative autonomous exploration method that can recover from robot failures.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"17 ","pages":"1179033"},"PeriodicalIF":3.1,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10277487/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9710004","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Context-aware lightweight remote-sensing image super-resolution network.","authors":"Guangwen Peng, Minghong Xie, Liuyang Fang","doi":"10.3389/fnbot.2023.1220166","DOIUrl":"https://doi.org/10.3389/fnbot.2023.1220166","url":null,"abstract":"<p><p>In recent years, remote-sensing image super-resolution (RSISR) methods based on convolutional neural networks (CNNs) have achieved significant progress. However, the limited receptive field of the convolutional kernel in CNNs hinders the network's ability to effectively capture long-range features in images, thus limiting further improvements in model performance. Additionally, the deployment of existing RSISR models to terminal devices is challenging due to their high computational complexity and large number of parameters. To address these issues, we propose a Context-Aware Lightweight Super-Resolution Network (CALSRN) for remote-sensing images. The proposed network primarily consists of Context-Aware Transformer Blocks (CATBs), which incorporate a Local Context Extraction Branch (LCEB) and a Global Context Extraction Branch (GCEB) to explore both local and global image features. Furthermore, a Dynamic Weight Generation Branch (DWGB) is designed to generate aggregation weights for global and local features, enabling dynamic adjustment of the aggregation process. Specifically, the GCEB employs a Swin Transformer-based structure to obtain global information, while the LCEB utilizes a CNN-based cross-attention mechanism to extract local information. Ultimately, global and local features are aggregated using the weights acquired from the DWGB, capturing the global and local dependencies of the image and enhancing the quality of super-resolution reconstruction. The experimental results demonstrate that the proposed method is capable of reconstructing high-quality images with fewer parameters and less computational complexity compared with existing methods.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"17 ","pages":"1220166"},"PeriodicalIF":3.1,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10326516/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9810140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"YOLOv7-CSAW for maritime target detection.","authors":"Qiang Zhu, Ke Ma, Zhong Wang, Peibei Shi","doi":"10.3389/fnbot.2023.1210470","DOIUrl":"https://doi.org/10.3389/fnbot.2023.1210470","url":null,"abstract":"<p><strong>Introduction: </strong>The issue of low detection rates and high false negative rates in maritime search and rescue operations has been a critical problem in current target detection algorithms. This is mainly due to the complex maritime environment and the small size of most targets. These challenges affect the algorithms' robustness and generalization.</p><p><strong>Methods: </strong>We proposed YOLOv7-CSAW, an improved maritime search and rescue target detection algorithm based on YOLOv7. We used the K-means++ algorithm to determine optimal sizes for the prior anchor boxes, ensuring an accurate match with actual objects. The C2f module was incorporated for a lightweight model capable of obtaining richer gradient flow information. The model's perception of small target features was increased with the parameter-free simple attention module (SimAM). We further upgraded the feature fusion network to an adaptive feature fusion network (ASFF) to address the lack of high-level semantic features in small targets. Lastly, we implemented the wise intersection over union (WIoU) loss function to tackle large positioning errors and missed detections.</p><p><strong>Results: </strong>Our algorithm was extensively tested on a maritime search and rescue dataset with YOLOv7 as the baseline model. We observed a significant improvement in detection performance compared to traditional deep learning algorithms, with a mean average precision (mAP) improvement of 10.73% over the baseline model.</p><p><strong>Discussion: </strong>YOLOv7-CSAW significantly enhances the accuracy and robustness of small target detection in complex scenes. This algorithm effectively addresses the common issues experienced in maritime search and rescue operations, specifically improving the detection rates and reducing false negatives, proving to be a superior alternative to current target detection algorithms.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"17 ","pages":"1210470"},"PeriodicalIF":3.1,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10352484/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9835122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
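Anchor sizing by clustering box shapes, as in the K-means++ step of YOLOv7-CSAW, can be sketched with plain k-means on (width, height) pairs. Random seeding is used here for brevity — k-means++ differs only in how the initial centers are chosen — and all box data are toy values, not from the paper's dataset.

```python
import random

def kmeans_anchors(boxes, k, iters=20, seed=0):
    """Cluster (w, h) box sizes into k anchor shapes with plain k-means.

    The paper uses k-means++ seeding; this sketch uses a random sample of
    the data as initial centers, which only changes the initialization.
    """
    rng = random.Random(seed)
    centers = rng.sample(boxes, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for w, h in boxes:
            # Assign each box to the nearest center (squared distance).
            i = min(range(k),
                    key=lambda j: (w - centers[j][0]) ** 2
                                  + (h - centers[j][1]) ** 2)
            clusters[i].append((w, h))
        for i, cl in enumerate(clusters):
            if cl:  # Keep the old center if a cluster goes empty.
                centers[i] = (sum(w for w, _ in cl) / len(cl),
                              sum(h for _, h in cl) / len(cl))
    return sorted(centers)

# Toy box sizes: three loose clusters of small, medium, and large targets.
boxes = [(10, 12), (11, 13), (30, 35), (32, 33), (60, 70), (62, 68)]
anchors = kmeans_anchors(boxes, k=3)
print(anchors)
```

Detection frameworks typically use anchors like these as the prior box shapes at each scale, so well-clustered anchors start the regression closer to the true boxes.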
{"title":"Elbow-sideWINDER (Elbow-side Wearable INDustrial Ergonomic Robot): design, control, and validation of a novel elbow exoskeleton.","authors":"Daegeun Park, Christian Di Natali, Matteo Sposito, Darwin G Caldwell, Jesus Ortiz","doi":"10.3389/fnbot.2023.1168213","DOIUrl":"https://doi.org/10.3389/fnbot.2023.1168213","url":null,"abstract":"<p><p>Musculoskeletal disorders associated with the elbow are one of the most common forms of work-related injuries. Exoskeletons have been proposed as an approach to reduce and ideally eliminate these injuries; however, exoskeletons introduce their own problems, especially discomfort due to joint misalignment. The Elbow-sideWINDER, with its associated control strategy, is a novel elbow exoskeleton that assists elbow flexion/extension during occupational tasks. This study describes the exoskeleton, showing how it can minimize discomfort caused by joint misalignment, maximize assistive performance, and provide increased robustness and reliability in real worksites. The proposed medium-level control strategy provides effective assistive torque using three control units: an arm kinematics estimator, a load estimator, and a friction compensator. The combined hardware/software system of the Elbow-sideWINDER is tested in load-lifting tasks (2 and 7 <i>kg</i>). This experiment focuses on the reduction in the activation level of the biceps brachii and triceps brachii in both arms and the change in the range of motion of the elbow during the task. It is shown that using the Elbow-sideWINDER, the biceps brachii, responsible for elbow flexion, was significantly less activated (up to 38.8% at 2 <i>kg</i> and 25.7% at 7 <i>kg</i>, on average for both arms). For the triceps brachii, the muscle activation was reduced by up to 37.0% at 2 <i>kg</i> and 35.1% at 7 <i>kg</i>, on average for both arms. When wearing the exoskeleton, the range of motion of the elbow was reduced by up to 13.0° during the task, but it remained within a safe range and could be compensated for by other joints such as the waist or knees. These are extremely encouraging results that provide good indicators and important clues for future improvement of the Elbow-sideWINDER and its control strategy.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"17 ","pages":"1168213"},"PeriodicalIF":3.1,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10369055/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9889452","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dual graph convolutional networks integrating affective knowledge and position information for aspect sentiment triplet extraction.","authors":"Yanbo Li, Qing He, Damin Zhang","doi":"10.3389/fnbot.2023.1193011","DOIUrl":"https://doi.org/10.3389/fnbot.2023.1193011","url":null,"abstract":"<p><p>Aspect Sentiment Triplet Extraction (ASTE) is a challenging task in natural language processing (NLP) that aims to extract triplets from comments. Each triplet comprises an aspect term, an opinion term, and the sentiment polarity of the aspect term. The neural network model developed for this task can enable robots to effectively identify and extract the most meaningful and relevant information from comment sentences, ultimately leading to better products and services for consumers. Most existing end-to-end models focus solely on learning the interactions between the three elements in a triplet and contextual words, ignoring the rich affective knowledge contained in each word and paying insufficient attention to the relationships between multiple triplets in the same sentence. To address this gap, this study proposes a novel end-to-end model called the Dual Graph Convolutional Networks Integrating Affective Knowledge and Position Information (DGCNAP). This model jointly considers both the contextual features and the affective knowledge by introducing the affective knowledge from SenticNet into the dependency graph construction of two parallel channels. In addition, a novel multi-target position-aware function is added to the graph convolutional network (GCN) to reduce the impact of noise information and capture the relationships between potential triplets in the same sentence by assigning greater positional weights to words that are in proximity to aspect or opinion terms. The experiment results on the ASTE-Data-V2 datasets demonstrate that our model significantly outperforms other state-of-the-art models, achieving F1 scores of 70.72, 57.57, 61.19, and 69.58 on 14res, 14lap, 15res, and 16res, respectively.</p>","PeriodicalId":12628,"journal":{"name":"Frontiers in Neurorobotics","volume":"17 ","pages":"1193011"},"PeriodicalIF":3.1,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10469445/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10208775","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
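The multi-target position-aware weighting described in the DGCNAP abstract — larger weights for words near aspect or opinion terms — might look like this in its simplest linear-decay form. The exact function in the paper is not given here, so the formula, decay rate, and clipping are assumptions for illustration.

```python
def position_weights(seq_len, term_indices, decay=0.1):
    """Assign each token a weight that decays linearly with its distance
    to the nearest aspect/opinion term, clipped at zero.

    Illustrative of a multi-target position-aware function: with several
    target terms, each token is scored by its nearest one.
    """
    weights = []
    for i in range(seq_len):
        d = min(abs(i - t) for t in term_indices)  # nearest target term
        weights.append(max(0.0, 1.0 - decay * d))
    return weights

# A 10-token sentence where tokens 2 and 7 are an aspect and an opinion term.
w = position_weights(10, [2, 7])
print(w)
```

Multiplying GCN node features by such weights down-weights words far from every target term, which is how noise words contribute less to the aggregated representation.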