{"title":"Spatial–Temporal Spiking Feature Pruning in Spiking Transformer","authors":"Zhaokun Zhou;Kaiwei Che;Jun Niu;Man Yao;Guoqi Li;Li Yuan;Guibo Luo;Yuesheng Zhu","doi":"10.1109/TCDS.2024.3500018","DOIUrl":"https://doi.org/10.1109/TCDS.2024.3500018","url":null,"abstract":"Spiking neural networks (SNNs) are known for their brain-inspired architecture and low power consumption. Leveraging biocompatibility and the self-attention mechanism, Spiking Transformers have become the most promising SNN architecture with high accuracy. However, Spiking Transformers still face the challenge of high training costs; for example, a 51<inline-formula><tex-math>$M$</tex-math></inline-formula>-parameter network requires 181 training hours on ImageNet. In this work, we explore feature pruning to reduce training costs and address two challenges: achieving a high pruning ratio and keeping the pruning method lightweight. We first analyze the spiking features and find the potential for a high pruning ratio. The majority of the information is concentrated in a subset of the spiking features in the Spiking Transformer, which suggests that we can keep the corresponding tokens and prune the others. To keep the method lightweight, a parameter-free spatial–temporal spiking feature pruning method is proposed, which uses only a simple addition-and-sorting operation. The spiking features/tokens with high spike accumulation values are selected for training. The others are pruned and merged through a compensation module called Softmatch. Experimental results demonstrate that our method reduces training costs without compromising image classification accuracy. 
On ImageNet, our approach reduces the training time from 181 to 128 h while achieving comparable accuracy (83.13% versus 83.07%).","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"17 3","pages":"644-658"},"PeriodicalIF":5.0,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144213652","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Interaction Is Worth More Explanations: Improving Human–Object Interaction Representation With Propositional Knowledge","authors":"Feng Yang;Yichao Cao;Xuanpeng Li;Weigong Zhang","doi":"10.1109/TCDS.2024.3496566","DOIUrl":"https://doi.org/10.1109/TCDS.2024.3496566","url":null,"abstract":"Detecting human–object interactions (HOI) presents a formidable challenge, necessitating the discernment of intricate, high-level relationships between humans and objects. Recent studies have explored HOI vision-and-language modeling (HOI-VLM), which leverages linguistic information inspired by cross-modal technology. Despite its promise, current methodologies face challenges due to the constraints of limited annotation vocabularies and suboptimal word embeddings, which hinder effective alignment with visual features and, consequently, the efficient transfer of linguistic knowledge. In this work, we propose a novel cross-modal framework that leverages external propositional knowledge, which harmonizes annotation text with a broader spectrum of world knowledge, enabling a more explicit and unambiguous representation of complex semantic relationships. Additionally, multiple complexities arise from the symbiotic or distinctive relationships inherent in a single HO pair, along with identical interactions occurring across diverse HO pairs (e.g., “human ride bicycle” versus “human ride horse”). The challenge lies in understanding the subtle differences and similarities between interactions involving different objects or occurring in varied contexts. To this end, we propose the Jaccard contrast strategy to simultaneously optimize cross-modal representation consistency across HO pairs (especially for cases where multiple interactions occur), which encompasses both vision-to-vision and vision-to-knowledge alignment objectives. 
The effectiveness of our proposed method is comprehensively validated through extensive experiments, showcasing its superiority in the field of HOI analysis.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"17 3","pages":"631-643"},"PeriodicalIF":5.0,"publicationDate":"2024-11-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144213612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling Task Engagement to Regulate Reinforcement Learning-Based Decoding for Online Brain Control","authors":"Xiang Zhang;Xiang Shen;Yiwen Wang","doi":"10.1109/TCDS.2024.3492199","DOIUrl":"https://doi.org/10.1109/TCDS.2024.3492199","url":null,"abstract":"Brain–machine interfaces (BMIs) offer significant promise for enabling paralyzed individuals to control external devices using their brain signals. One challenge is that during the online brain control (BC) process, subjects may not be completely immersed in the task, particularly when multiple steps are needed to achieve a goal. The decoder indiscriminately takes the less engaged trials as training data, which might decrease the decoding accuracy. In this article, we propose an alternative kernel reinforcement learning (RL)-based decoder that trains online with continuous parameter updates. We model neural activity from the medial prefrontal cortex (mPFC), a reward-related brain region, to represent task engagement. This information is incorporated into a stochastic learning rate using an exponential model, which measures the relevancy of the neural data. The proposed algorithm was evaluated in an experiment in which rats performed a cursor-reaching BC task. We found that the neural activity from the mPFC contained engagement information that was negatively correlated with trial response time. Moreover, compared to the RL method without task engagement modeling, our proposed method enhanced training efficiency, using only half of the training data to achieve the same reconstruction accuracy of the cursor trajectory. 
The results demonstrate the potential of our RL framework for improving online BC tasks.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"17 3","pages":"606-614"},"PeriodicalIF":5.0,"publicationDate":"2024-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144213532","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Developmental Networks With Foveation","authors":"Xiang Wu;Juyang Weng","doi":"10.1109/TCDS.2024.3492181","DOIUrl":"https://doi.org/10.1109/TCDS.2024.3492181","url":null,"abstract":"The foveated nature of the human vision system (HVS) means that the acuity of the retina peaks at the center of the fovea and gradually decreases toward the periphery with increasing eccentricity. Foveation is general-purpose, meaning the fovea is used more often than the periphery. Self-generated saccades dynamically project the fovea onto different parts of the visual world so that the high-acuity fovea can process parts of interest at different times. It is still unclear why biological vision uses foveation. To the best of our knowledge, this work presents the first foveated neural network, although with a limited scope. We study two subjects here as follows. 1) We design a biological density of cones (BDOC) foveation method for image warping to simulate a biologically plausible foveated retina using a commonly available uniform-pixel camera. 2) Although this article is not specific to any task, we choose a challenging task, visual navigation, as an example of quantitative and spatiotemporal tasks, and compare our approach with deep learning. 
Our experimental results showed that 1) the BDOC foveation is logically and visually correct; and 2) the developmental network (DN) surprisingly outperforms deep learning, and foveation helps both network types.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"17 3","pages":"592-605"},"PeriodicalIF":5.0,"publicationDate":"2024-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144213530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Task and Motion Planning of Service Robot Arm in Unknown Environment Based on Virtual Voxel-Semantic Space","authors":"Lipeng Wang;Xiaochen Wang;Junjun Huang;Mengjie Liu","doi":"10.1109/TCDS.2024.3489773","DOIUrl":"https://doi.org/10.1109/TCDS.2024.3489773","url":null,"abstract":"A task and motion planning method for a service robot arm based on 3-D voxel-semantic maps is proposed, which can realize virtual environment mapping, manipulator planning, and grasping tasks in unknown environments. First, a complete point cloud scene is obtained and stitched. A mask region-based convolutional neural network (Mask R-CNN) is used to perform object detection and instance segmentation. A voxel-semantic hybrid map composed of a 3-D point cloud, semantic information, and 3-D computer-aided design (CAD) models is constructed. Second, an improved A* algorithm is proposed to plan the optimal path of the robot arm end-effector. Bezier curve interpolation is introduced to obtain a smooth trajectory. Third, the grasping poses of the robot gripper corresponding to different geometries are explored. Semantic-driven spatial task planning is achieved by decomposing robotic arm pick-and-place tasks. Finally, the effectiveness and speed of the proposed algorithm are verified in virtual space and real physical space, respectively.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"17 3","pages":"564-576"},"PeriodicalIF":5.0,"publicationDate":"2024-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144213662","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data Augmentation for Seizure Prediction With Generative Diffusion Model","authors":"Kai Shu;Le Wu;Yuchang Zhao;Aiping Liu;Ruobing Qian;Xun Chen","doi":"10.1109/TCDS.2024.3489357","DOIUrl":"https://doi.org/10.1109/TCDS.2024.3489357","url":null,"abstract":"Data augmentation (DA) can significantly strengthen electroencephalogram (EEG)-based seizure prediction methods. However, existing DA approaches are merely linear transformations of the original data and cannot explore the feature space to increase diversity effectively. Therefore, we propose a novel diffusion-based DA method called DiffEEG. DiffEEG can fully explore the data distribution and generate samples with high diversity, offering extra information to classifiers. It involves two processes: the diffusion process and the denoising process. In the diffusion process, the model incrementally adds noise at different scales to the EEG input and converts it into random noise. In this way, the representation of the data can be learned. In the denoising process, the model utilizes the learned knowledge to sample synthetic data from random noise input by gradually removing the noise. The randomness of the input noise and the precise representation enable the synthetic samples to possess diversity while ensuring the consistency of the feature space. We compared DiffEEG with the original, down-sampling, sliding-window, and recombination methods, and integrated them into five representative classifiers. The experiments demonstrate the effectiveness and generality of our method. 
With the contribution of DiffEEG, the multiscale CNN achieves state-of-the-art performance, with an average sensitivity, FPR, and AUC of 95.4%, 0.051/h, and 0.932 on the CHB-MIT database, and 93.6%, 0.121/h, and 0.822 on the Kaggle database.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"17 3","pages":"577-591"},"PeriodicalIF":5.0,"publicationDate":"2024-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144213529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptive Environment Generation for Continual Learning: Integrating Constraint Logic Programming With Deep Reinforcement Learning","authors":"Youness Boutyour;Abdellah Idrissi","doi":"10.1109/TCDS.2024.3485482","DOIUrl":"https://doi.org/10.1109/TCDS.2024.3485482","url":null,"abstract":"In this article, we introduce a novel framework that combines constraint logic programming (CLP) with deep reinforcement learning (DRL) to create adaptive environments for continual learning. We focus on two challenging domains: Sudoku puzzles and scheduling problems, where environment complexity evolves based on the agent's performance. By integrating CLP, we dynamically adjust problem difficulty in response to the agent's learning trajectory, ensuring a progressively challenging environment that fosters enhanced problem-solving skills. Empirical results across 500 000 episodes show substantial improvements in solve rates, increasing from 6% to 86% for Sudoku puzzles and from 7% to 79% for scheduling problems, alongside significant reductions in the average number of steps required to solve each problem. The proposed adaptive environment generation demonstrates the potential of CLP in advancing DRL agents’ continual learning capabilities by dynamically regulating complexity, thus improving their adaptability and learning efficiency. 
This framework contributes to the broader fields of reinforcement learning and procedural content generation by introducing an innovative approach to continual adaptation in complex environments.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"17 3","pages":"540-553"},"PeriodicalIF":5.0,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144213533","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Task-Oriented Deep Learning Approach for Human Localization","authors":"Yu-Jia Chen;Wei Chen;Sai Qian Zhang;Hai-Yan Huang;H.T. Kung","doi":"10.1109/TCDS.2024.3485886","DOIUrl":"https://doi.org/10.1109/TCDS.2024.3485886","url":null,"abstract":"Radio-based human sensing has attracted substantial research attention due to its wide range of applications, including e-healthcare monitoring, indoor security, and industrial surveillance. However, most existing studies rely on fixed receivers to capture wireless signal perturbations. This article introduces UH-Sense, the first human sensing system using an unmanned aerial vehicle (UAV) equipped with an omnidirectional antenna to measure signal strength from surrounding WiFi access points (APs). UH-Sense addresses the challenge of multisource UAV-induced noise with a novel data-driven, learning-based approach that denoises corrupted data without prior knowledge of the noise characteristics. Furthermore, we develop a localization model based on radio tomography imaging (RTI) that localizes humans without collecting a fingerprint database. We demonstrate that UH-Sense is readily deployable on commodity platforms and evaluate its performance in different real-world environments, including irregular AP deployment and non-line-of-sight (NLOS) scenarios. 
Experimental results show that UH-Sense achieves high detection performance with an average F1 score of 0.93 and yields similar or even better localization performance than that obtained using clean data (i.e., data collected at a fixed receiver), which has not been achieved by any of the state-of-the-art denoising methods.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"17 3","pages":"525-539"},"PeriodicalIF":5.0,"publicationDate":"2024-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144213527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Kernel-Based Actor–Critic Learning Framework for Autonomous Brain Control on Trajectory","authors":"Zhiwei Song;Xiang Zhang;Shuhang Chen;Jieyuan Tan;Yiwen Wang","doi":"10.1109/TCDS.2024.3485078","DOIUrl":"https://doi.org/10.1109/TCDS.2024.3485078","url":null,"abstract":"Reinforcement learning (RL)-based brain–machine interfaces (BMIs) hold promise for restoring motor functions in paralyzed individuals. These interfaces interpret neural activity to control external devices through trial and error. In brain control (BC) tasks, subjects continuously control the device moving in space by imagining their own limb movements, and can change direction at any position before reaching the target. Such multistep BC tasks span a large space both in neural states and over sequences of movements. However, conventional RL decoders face challenges in efficient exploration and receive limited guidance from delayed rewards. In this article, we propose a kernel-based actor–critic learning framework for multistep BC tasks. Our framework integrates continuous trajectory control (the actor) and internal continuous state value estimation (the critic) from medial prefrontal cortex (mPFC) activity. We evaluate our algorithm's performance in a BC three-lever discrimination task using data from two rats, comparing it to a kernel RL decoder with internal binary rewards and delayed external rewards. Experimental results show that our approach achieves faster convergence, shorter target-acquisition time, and shorter distances to targets. 
These findings highlight the potential of our algorithm for clinical applications in multistep BC tasks.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"17 3","pages":"554-563"},"PeriodicalIF":5.0,"publicationDate":"2024-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144213534","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Simultaneous Estimation of Human Motion Intention and Time-Varying Arm Stiffness for Enhanced Human–Robot Interaction","authors":"Huayang Wu;Chengzhi Zhu;Long Cheng;Chenguang Yang;Yanan Li","doi":"10.1109/TCDS.2024.3480854","DOIUrl":"https://doi.org/10.1109/TCDS.2024.3480854","url":null,"abstract":"Recent advances in physiological human motor control research indicate that the magnitude of human endpoint stiffness increases linearly with grasp force. Based on these findings, this article proposes a scheme that integrates the linear quadratic estimation (LQE) filter with a stiffness model inferred from grasp force, which can simultaneously estimate the human arm's stiffness and motion intention. An online variable impedance controller (VIC) is then designed based on these estimations for physical human–robot interaction (pHRI). The proposed stiffness model and estimation method are validated through experiments using a planar robotic interface. To assess their performance in practical pHRI tasks, the human arm stiffness and intention estimation, combined with the VIC, is extended to teleoperation peg-in-hole and robot-assisted rehabilitation tasks. The experimental results demonstrate that the proposed method can effectively estimate human motion intention and arm stiffness simultaneously. 
Compared to existing methods, the proposed VIC enhances pHRI in terms of increased flexibility, effective guidance, and reduced human effort.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"17 3","pages":"510-524"},"PeriodicalIF":5.0,"publicationDate":"2024-10-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144213667","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}