Frontiers in Neurorobotics | Pub Date: 2024-11-15 | eCollection Date: 2024-01-01 | DOI: 10.3389/fnbot.2024.1478181
Ruiying Pan

Multimodal fusion-powered English speaking robot.

Introduction: Speech recognition and multimodal learning are two critical areas in machine learning. Current multimodal speech recognition systems often encounter challenges such as high computational demands and model complexity.

Methods: To overcome these issues, we propose a novel framework, EnglishAL-Net, a multimodal fusion-powered English-speaking robot. This framework leverages the ALBEF model, optimizing it for real-time speech and multimodal interaction, and incorporates a newly designed text and image editor to fuse visual and textual information. The robot processes dynamic spoken input through the integration of Neural Machine Translation (NMT), enhancing its ability to understand and respond to spoken language.

Results and discussion: In the experimental section, we constructed a dataset containing various scenarios and oral instructions for testing. The results show that, compared to traditional unimodal processing methods, our model significantly improves both language understanding accuracy and response time. This research not only enhances the performance of multimodal interaction in robots but also opens up new possibilities for applications of robotic technology in education, rescue, customer service, and other fields, holding significant theoretical and practical value.

Volume 18, article 1478181 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11604748/pdf/
Frontiers in Neurorobotics | Pub Date: 2024-11-14 | eCollection Date: 2024-01-01 | DOI: 10.3389/fnbot.2024.1490198
Hao Hu, Rui Wang, Hao Lin, Huai Yu

UnionCAM: enhancing CNN interpretability through denoising, weighted fusion, and selective high-quality class activation mapping.

Deep convolutional neural networks (CNNs) have achieved remarkable success in various computer vision tasks. However, the lack of interpretability in these models has raised concerns and hindered their widespread adoption in critical domains. Generating activation maps that highlight the regions contributing to the CNN's decision has emerged as a popular approach to visualizing and interpreting these models. Nevertheless, existing methods often produce activation maps contaminated with irrelevant background noise or incomplete object activation, limiting their effectiveness as meaningful explanations. To address this challenge, we propose Union Class Activation Mapping (UnionCAM), an innovative visual interpretation framework that generates high-quality class activation maps (CAMs) through a novel three-step approach. First, the denoising module removes background noise from CAMs using adaptive thresholding. Subsequently, the union module fuses the denoised CAMs with region-based CAMs using a weighted combination scheme to obtain more comprehensive and informative maps, which we refer to as fused CAMs. Lastly, the activation map selection module automatically selects, from the pool of fused CAMs, the CAM that offers the best interpretation. Extensive experiments on the ILSVRC2012 and VOC2007 datasets demonstrate UnionCAM's superior performance over state-of-the-art methods: it effectively suppresses background noise, captures complete object regions, provides intuitive visual explanations, and achieves significant improvements in insertion and deletion scores over the best baseline. By combining a novel denoising strategy, adaptive fusion of CAMs, and an automatic selection mechanism, UnionCAM bridges the gap between CNN performance and interpretability, providing a valuable tool for understanding and trusting CNN-based systems and fostering their responsible deployment in real-world applications.

Volume 18, article 1490198 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11602493/pdf/
Frontiers in Neurorobotics | Pub Date: 2024-11-11 | eCollection Date: 2024-01-01 | DOI: 10.3389/fnbot.2024.1499703
Jiuling Dong, Zehui Li, Yuanshuo Zheng, Jingtang Luo, Min Zhang, Xiaolong Yang

Real-time fault detection for IIoT facilities using GA-Att-LSTM based on edge-cloud collaboration.

With the rapid development of Industrial Internet of Things (IIoT) technology, IIoT devices are generating large amounts of industrial sensor data that are spatiotemporally correlated and heterogeneous, coming from multiple sources and domains. This poses a challenge to current detection algorithms. This paper therefore proposes an improved long short-term memory (LSTM) neural network model based on a genetic algorithm, an attention mechanism, and an edge-cloud collaboration framework (GA-Att-LSTM) to detect anomalies in IIoT facilities. First, an edge-cloud collaboration framework is established to process large amounts of sensor data at the edge nodes in real time, which reduces the time spent uploading sensor data to the cloud platform. Second, to overcome the insufficient attention paid to important features of the input sequence in traditional LSTM algorithms, we introduce an attention mechanism that adaptively adjusts the weights of important features in the model. Meanwhile, a genetic algorithm is used to optimize the hyperparameters of the LSTM network, transforming anomaly detection into a classification problem and effectively extracting the correlations in time-series data, which improves the recognition rate of fault detection. Finally, the proposed method was evaluated on a publicly available fault database. The results indicate an accuracy of 99.6%, an F1-score of 84.2%, a precision of 89.8%, and a recall of 77.6%, all of which exceed the performance of five traditional machine learning methods.

Volume 18, article 1499703 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11586361/pdf/
Frontiers in Neurorobotics | Pub Date: 2024-11-08 | eCollection Date: 2024-01-01 | DOI: 10.3389/fnbot.2024.1484038
Hadi Sedigh Malekroodi, Seon-Deok Seo, Jinseong Choi, Chang-Soo Na, Byeong-Il Lee, Myunggi Yi

Real-time location of acupuncture points based on anatomical landmarks and pose estimation models.

Introduction: Precise identification of acupuncture points (acupoints) is essential for effective treatment, but manual location by untrained individuals often lacks accuracy and consistency. This study proposes two approaches that use artificial intelligence (AI), specifically computer vision, to automatically and accurately identify acupoints on the face and hand in real time, enhancing both precision and accessibility in acupuncture practice.

Methods: The first approach applies a real-time landmark detection system to locate 38 specific acupoints on the face and hand by translating anatomical landmarks from image data into acupoint coordinates. The second approach uses a convolutional neural network (CNN) optimized for pose estimation to detect five key acupoints on the arm and hand (LI11, LI10, TE5, TE3, LI4), drawing on constrained medical imaging data for training. To validate these methods, we compared the predicted acupoint locations with those annotated by experts.

Results: Both approaches demonstrated high accuracy, with mean localization errors of less than 5 mm when compared to expert annotations. The landmark detection system successfully mapped multiple acupoints across the face and hand, even in complex imaging scenarios. The data-driven approach accurately detected the five arm and hand acupoints with a mean Average Precision (mAP) of 0.99 at OKS 50%.

Discussion: These AI-driven methods establish a solid foundation for automated acupoint localization, supporting both self-guided and professional acupuncture practice. By enabling precise, real-time localization of acupoints, these technologies could improve treatment accuracy, facilitate self-training, and increase the accessibility of acupuncture. Future developments could expand these models to cover additional acupoints and incorporate them into intuitive applications for broader use.

Volume 18, article 1484038 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11609928/pdf/
Frontiers in Neurorobotics | Pub Date: 2024-11-06 | eCollection Date: 2024-01-01 | DOI: 10.3389/fnbot.2024.1484751
Jinlin Wang, Yulong Ji, Hongyu Yang

Vahagn: VisuAl Haptic Attention Gate Net for slip detection.

Introduction: Slip detection is crucial for achieving stable grasping and subsequent operational tasks. A grasp action is a continuous process that requires information from multiple sources, and the success of a specific grasping maneuver hinges on two factors: the spatial accuracy of the contact and the stability of the continuous process.

Methods: In this paper, for the task of perceiving grasp outcomes from visual-haptic information, we propose a new slip-detection method that synergizes visual and haptic information across both spatial and temporal dimensions. Specifically, the method takes as input a sequence of visual images from a first-person perspective and a sequence of haptic images from a gripper. It then extracts time-dependent features of the whole process, together with spatial features weighted by the importance of different regions, using dedicated attention mechanisms. Inspired by neurological studies, during information fusion we adjust temporal and spatial information from vision and haptics through a combination of two-step fusion and gate units.

Results and discussion: To validate the effectiveness of the method, we compared it with traditional CNN models and attention-based models. Our method achieves a classification accuracy of 93.59%, higher than that of previous works. Attention visualizations are further presented to support its validity.

Volume 18, article 1484751 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11576469/pdf/
Frontiers in Neurorobotics | Pub Date: 2024-10-31 | eCollection Date: 2024-01-01 | DOI: 10.3389/fnbot.2024.1453061
An Jianliang

A multimodal educational robots driven via dynamic attention.

Introduction: With the development of artificial intelligence and robotics technology, the application of educational robots in teaching is becoming increasingly popular. However, effectively evaluating and optimizing multimodal educational robots remains a challenge.

Methods: This study introduces Res-ALBEF, a multimodal educational robot framework driven by dynamic attention. Res-ALBEF enhances the ALBEF (Align Before Fuse) method by incorporating residual connections to align visual and textual data more effectively before fusion. In addition, the model integrates a VGG19-based convolutional network for image feature extraction and utilizes a dynamic attention mechanism to focus adaptively on the relevant parts of multimodal inputs. Our model was trained on a diverse dataset of 50,000 multimodal educational instances covering a variety of subjects and instructional content.

Results and discussion: Evaluation on an independent validation set of 10,000 samples demonstrated significant performance improvements: the model achieved an overall accuracy of 97.38% in educational content recognition. These results highlight the model's ability to improve the alignment and fusion of multimodal information, making it a robust solution for multimodal educational robots.

Volume 18, article 1453061 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11560911/pdf/
Frontiers in Neurorobotics | Pub Date: 2024-10-31 | eCollection Date: 2024-01-01 | DOI: 10.3389/fnbot.2024.1457843
Dong Chen, Peisong Wu, Mingdong Chen, Mengtao Wu, Tao Zhang, Chuanqi Li

LS-VIT: Vision Transformer for action recognition based on long and short-term temporal difference.

Over the past few years, a growing number of researchers have focused on temporal modeling. The advent of transformer-based methods has notably advanced the field of 2D image-based vision tasks. However, for 3D video tasks such as action recognition, applying temporal transformers directly to video data significantly increases both computational and memory demands, because the number of data patches multiplies and self-attention computations become more complex. Accordingly, building efficient and precise 3D self-attentive models for video content represents a major challenge for transformers. In our research, we introduce a Long and Short-term Temporal Difference Vision Transformer (LS-VIT). This method incorporates short-term motion details into images by weighting the differences across several consecutive frames, thereby equipping the original image with the ability to model short-term motion. Concurrently, we integrate a module designed to capture long-term motion details, which enhances the model's capacity for long-term motion modeling by directly integrating temporal differences from various segments via motion excitation. Our thorough analysis confirms that LS-VIT achieves high recognition accuracy across multiple benchmarks (e.g., UCF101, HMDB51, Kinetics-400). These results indicate that LS-VIT has the potential for further optimization to improve real-time performance and action prediction capabilities.

Volume 18, article 1457843 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11560894/pdf/
Frontiers in Neurorobotics | Pub Date: 2024-10-31 | eCollection Date: 2024-01-01 | DOI: 10.3389/fnbot.2024.1443010
Haneen Alsuradi, Joseph Hong, Helin Mazi, Mohamad Eid

Neuro-motor controlled wearable augmentations: current research and emerging trends.

Wearable augmentations (WAs) designed for movement and manipulation, such as exoskeletons and supernumerary robotic limbs, are used to enhance the physical abilities of healthy individuals and to substitute or restore lost functionality for impaired individuals. Non-invasive neuro-motor (NM) technologies, including electroencephalography (EEG) and surface electromyography (sEMG), promise direct and intuitive communication between the brain and the WA. After presenting a historical perspective, this review proposes a conceptual model for NM-controlled WAs and analyzes key design aspects, such as hardware design, mounting methods, control paradigms, and sensory feedback, that have direct implications for the user experience and, in the long term, for the embodiment of WAs. The literature is surveyed and categorized into three main areas: hand WAs, upper-body WAs, and lower-body WAs. The review concludes by highlighting the primary findings, challenges, and trends in NM-controlled WAs, and it motivates researchers and practitioners to further explore and evaluate the development of WAs toward a better quality of life.

Volume 18, article 1443010 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11560910/pdf/
Frontiers in Neurorobotics | Pub Date: 2024-10-29 | eCollection Date: 2024-01-01 | DOI: 10.3389/fnbot.2024.1503038
Paloma de la Puente, Markus Vincze, Diego Guffanti, Daniel Galan

Editorial: Assistive and service robots for health and home applications (RH3 - Robot Helpers in Health and Home).

Volume 18, article 1503038 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11554614/pdf/
Frontiers in Neurorobotics | Pub Date: 2024-10-22 | eCollection Date: 2024-01-01 | DOI: 10.3389/fnbot.2024.1488337
Lei Wang, Danping Liu, Jun Wang

A modified A* algorithm combining remote sensing technique to collect representative samples from unmanned surface vehicles.

Ensuring the representativeness of collected samples is the most critical requirement of water sampling. Unmanned surface vehicles (USVs) have been widely adopted for water sampling, but current USV sampling path planning tends to overemphasize path optimization while neglecting the collection of representative samples. This study proposed a modified A* algorithm that combines remote sensing techniques and considers both path length and the representativeness of collected samples. Water quality parameters were first retrieved using satellite remote sensing imagery and a deep belief network model, and the parameter value was incorporated as a coefficient Q in the heuristic function of the A* algorithm. An adjustment coefficient k was then applied to Q to tune the trade-off between sampling representativeness and path length. To evaluate the effectiveness of the algorithm, chlorophyll-a concentration (Chl-a) was employed as the test parameter, with Chaohu Lake as the study area. Results showed that the algorithm was effective in collecting more representative samples under real-world conditions. As the coefficient k increased, the representativeness of the collected samples improved, with sample Chl-a values closely approximating the overall mean Chl-a and exhibiting a gradient distribution, at the cost of increased path length. This study is significant for USV water sampling and water environment protection.

Volume 18, article 1488337 | Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11535655/pdf/