IEEE Transactions on Cognitive and Developmental Systems: Latest Publications

A Two-Stage Foveal Vision Tracker Based on Transformer Model
IF 5.0 | CAS Tier 3, Computer Science
IEEE Transactions on Cognitive and Developmental Systems | Pub Date: 2024-03-18 | DOI: 10.1109/TCDS.2024.3377642
Guang Han; Jianshu Ma; Ziyang Li; Haitao Zhao
Abstract: With the development of transformer visual models, attention-based trackers have shown highly competitive performance in the field of object tracking. However, in some tracking scenarios, especially those with multiple similar objects, the performance of existing trackers is often unsatisfactory. To improve performance in such scenarios, and inspired by the structure and visual characteristics of the human fovea, this article proposes a novel foveal vision tracker (FVT). FVT combines the process of human eye fixation with object tracking, pruning tokens based on their distance to the object rather than on attention scores. This pruning method lets the receptive field of the feature extraction network focus on the object and excludes background interference. FVT divides the feature extraction network into two stages, local and global, and introduces a local recursive module (LRM) and a view elimination module (VEM). The LRM enhances foreground features in the local stage, while the VEM generates circular fovea-like visual field masks in the global stage and prunes tokens outside the mask, guiding the model to focus attention on high-information regions of the object. Experimental results on multiple object tracking datasets demonstrate that the proposed FVT achieves stronger object discrimination in the feature extraction stage, improves tracking accuracy and robustness in complex scenes, and achieves a significant accuracy improvement, with an area overlap (AO) of 72.6% on the generic object tracking GOT-10k dataset.
Vol. 16, No. 4, pp. 1575-1588
Citations: 0
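The distance-based pruning the abstract describes can be pictured as a circular fovea mask over the tracker's token grid. Below is a minimal sketch, assuming a ViT-style (B, N, C) token layout and a predicted object center in grid coordinates; the function name, radius, and masking-by-zeroing are illustrative assumptions, not the paper's implementation.

```python
import torch

def fovea_token_prune(tokens, center, grid_hw, radius):
    """Zero out ViT tokens whose grid cell lies outside a circular 'fovea'
    around the predicted object center (illustrative, not the paper's code).

    tokens : (B, N, C) token embeddings, N = H * W grid cells
    center : (B, 2) object center in (row, col) grid coordinates
    """
    H, W = grid_hw
    rows = torch.arange(H).view(H, 1).expand(H, W)
    cols = torch.arange(W).view(1, W).expand(H, W)
    coords = torch.stack([rows, cols], dim=-1).reshape(-1, 2).float()  # (N, 2)

    # Euclidean distance from every token cell to the object center
    dist = ((coords.unsqueeze(0) - center.unsqueeze(1).float()) ** 2).sum(-1).sqrt()
    keep = dist <= radius                      # (B, N) boolean fovea mask

    return tokens * keep.unsqueeze(-1).float(), keep

tokens = torch.randn(1, 14 * 14, 256)          # hypothetical 14x14 token grid
pruned, mask = fovea_token_prune(tokens, torch.tensor([[7.0, 7.0]]), (14, 14), 4.0)
print(mask.sum().item(), "of", mask.numel(), "tokens kept")
```

Pruning on distance rather than attention scores keeps the background excluded even when a distractor with a similar appearance attracts high attention, which is the failure mode the paper targets.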
Converting Artificial Neural Networks to Ultralow-Latency Spiking Neural Networks for Action Recognition
IF 5.0 | CAS Tier 3, Computer Science
IEEE Transactions on Cognitive and Developmental Systems | Pub Date: 2024-03-14 | DOI: 10.1109/TCDS.2024.3375620
Hong You; Xian Zhong; Wenxuan Liu; Qi Wei; Wenxin Huang; Zhaofei Yu; Tiejun Huang
Abstract: Spiking neural networks (SNNs) have garnered significant attention for their potential in ultralow-power, event-driven neuromorphic hardware implementations. One effective strategy for obtaining SNNs is the conversion of artificial neural networks (ANNs) to SNNs. However, existing research on ANN-SNN conversion has focused predominantly on image classification, leaving action recognition largely unexplored. In this article, we investigate the performance degradation of SNNs on action recognition. Through in-depth analysis, we propose a framework called scalable dual-threshold mapping (SDM) that effectively overcomes three types of conversion error. By mitigating these errors, we reduce the time required for the spike firing rate of the SNN to align with the activation values of the ANN. Consequently, our method enables the generation of accurate, ultralow-latency SNNs. We conduct extensive evaluations on multiple action recognition datasets, including University of Central Florida (UCF)-101 and Human Motion DataBase (HMDB)-51. Through rigorous experiments and analysis, we demonstrate the effectiveness of our approach. Notably, SDM achieves a remarkable top-1 accuracy of 92.94% on UCF-101 while requiring ultralow latency (four time steps), highlighting its high performance with reduced computational requirements.
Vol. 16, No. 4, pp. 1533-1545
Citations: 0
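ANN-SNN conversion rests on the property that the firing rate of an integrate-and-fire neuron over T time steps approximates the source network's ReLU activation; the residual gap at small T is one of the conversion errors SDM is designed to reduce. A toy sketch of this rate approximation, using an illustrative threshold and the paper's four-step latency; this is not the SDM algorithm itself.

```python
def if_neuron_rate(x, T=4, v_th=1.0):
    """Simulate an integrate-and-fire neuron driven by a constant input x
    for T time steps and return its firing rate (spikes / T).

    With reset-by-subtraction, rate * v_th approaches max(0, x) as T grows,
    which is the property ANN-to-SNN conversion relies on.
    """
    v, spikes = 0.0, 0
    for _ in range(T):
        v += x                 # integrate the input current
        if v >= v_th:
            v -= v_th          # soft reset: subtract the threshold
            spikes += 1
    return spikes / T

# Firing rate roughly tracks the ReLU activation of the source ANN;
# the mismatch at T=4 illustrates why low-latency conversion is hard.
for x in [-0.5, 0.3, 0.6, 0.9]:
    print(f"input {x:+.1f}  rate {if_neuron_rate(x):.2f}  relu {max(0.0, x):.2f}")
```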
EEG-Based Auditory Attention Detection With Spiking Graph Convolutional Network
IF 5.0 | CAS Tier 3, Computer Science
IEEE Transactions on Cognitive and Developmental Systems | Pub Date: 2024-03-12 | DOI: 10.1109/TCDS.2024.3376433
Siqi Cai; Ran Zhang; Malu Zhang; Jibin Wu; Haizhou Li
Abstract: Decoding auditory attention from brain activity, such as electroencephalography (EEG), sheds light on solving the machine cocktail party problem. However, effective representation of EEG signals remains a challenge. One reason is that current feature extraction techniques have not fully exploited the spatial information in EEG signals. EEG signals reflect the collective dynamics of brain activity across different regions; the intricate interactions among channels, rather than individual channels alone, capture the distinctive features of brain activity. In this study, we propose a spiking graph convolutional network (SGCN), which captures the spatial features of multichannel EEG in a biologically plausible manner. Comprehensive experiments were conducted on two publicly available datasets. The results demonstrate that the proposed SGCN achieves competitive auditory attention detection (AAD) performance in low-latency and low-density EEG settings. Given its low power consumption, the SGCN has potential for practical implementation in intelligent hearing aids and other brain-computer interfaces (BCIs).
Vol. 16, No. 5, pp. 1698-1706
Citations: 0
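A hedged sketch of the general mechanism, not the paper's SGCN: a graph convolution mixes each EEG channel with its neighbors according to a channel adjacency matrix, and the mixed currents drive leaky integrate-and-fire (LIF) neurons that emit sparse spikes. The adjacency, gain, and neuron constants below are placeholders.

```python
import torch

def lif_step(v, i_in, tau=2.0, v_th=1.0):
    """One leaky integrate-and-fire update: leak, integrate, spike, hard reset."""
    v = v / tau + i_in
    spk = (v >= v_th).float()
    return v * (1.0 - spk), spk

def spiking_graph_conv(x, adj, gain=1.0):
    """Toy spiking graph convolution over multichannel EEG.

    x   : (C, T) EEG, C channels by T samples
    adj : (C, C) row-normalized channel adjacency (spatial coupling)
    Returns the (C, T) spike trains emitted by the channel neurons.
    """
    C, T = x.shape
    v = torch.zeros(C)
    out = []
    for t in range(T):
        i_in = adj @ (gain * x[:, t])   # mix each channel with its neighbors
        v, spk = lif_step(v, i_in)
        out.append(spk)
    return torch.stack(out, dim=1)

eeg = torch.randn(8, 128)               # 8 channels, 128 samples (placeholder)
adj = torch.full((8, 8), 1.0 / 8)       # uniform coupling as a stand-in graph
print(spiking_graph_conv(eeg, adj).mean().item())  # overall spike density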
Small Object Detection Based on Microscale Perception and Enhancement-Location Feature Pyramid
IF 5.0 | CAS Tier 3, Computer Science
IEEE Transactions on Cognitive and Developmental Systems | Pub Date: 2024-03-07 | DOI: 10.1109/TCDS.2024.3397684
Guang Han; Chenwei Guo; Ziyang Li; Haitao Zhao
Abstract: Because images captured by unmanned aerial vehicles (UAVs) contain large numbers of small objects with significant scale variation and uneven distribution, existing algorithms suffer high rates of missed and false detections of small objects in drone images. This article proposes a new object detection algorithm based on microscale perception and an enhancement-location feature pyramid. The microscale perception module replaces the original convolution module in the backbone, changing the receptive field through two dilation branches with different dilation rates and an adjustment switch branch. To better match the size and shape of sampled targets, weighted deformable convolution is employed. The enhancement-location feature pyramid module aggregates features from each layer to obtain balanced semantic information and refines the aggregated features to strengthen their representational power. Moreover, a bottom-up branch is added to exploit the fact that lower-layer features are beneficial for locating small objects, enhancing small-object localization. Additionally, specific image cropping and combining techniques alter the target distribution of the training data, making the model more sensitive to small objects and improving its robustness. Finally, a sample balance strategy combining focal loss with a sample extraction control method addresses the imbalance between easy and hard samples and the long-tailed interclass distribution during training. Experimental results show that the proposed algorithm achieves a mean average precision of 35.9% on the VisDrone2019 dataset, a 14.2% improvement over the Cascade R-CNN baseline, and demonstrates better performance in detecting small objects in drone images. Compared with recent advanced algorithms, it also achieves state-of-the-art detection accuracy.
Vol. 16, No. 6, pp. 1982-1996
Citations: 0
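The two dilation branches and the adjustment switch can be sketched as parallel dilated convolutions reweighted by a learned, globally pooled gate. The dilation rates, gating design, and module name below are assumptions for illustration, not the paper's exact module.

```python
import torch
import torch.nn as nn

class MicroscalePerception(nn.Module):
    """Two dilation branches plus a learned switch that softly reweights
    them, so the effective receptive field adapts to object scale."""

    def __init__(self, ch, d1=1, d2=3):
        super().__init__()
        # padding = dilation keeps the spatial size for a 3x3 kernel
        self.branch1 = nn.Conv2d(ch, ch, 3, padding=d1, dilation=d1)
        self.branch2 = nn.Conv2d(ch, ch, 3, padding=d2, dilation=d2)
        # switch branch: global context -> soft weights over the two branches
        self.switch = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Conv2d(ch, 2, 1), nn.Softmax(dim=1)
        )

    def forward(self, x):
        w = self.switch(x)                    # (B, 2, 1, 1)
        return w[:, :1] * self.branch1(x) + w[:, 1:] * self.branch2(x)

m = MicroscalePerception(16)
print(m(torch.randn(2, 16, 32, 32)).shape)    # torch.Size([2, 16, 32, 32])
```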
LITE-SNN: Leveraging Inherent Dynamics to Train Energy-Efficient Spiking Neural Networks for Sequential Learning
IF 5.0 | CAS Tier 3, Computer Science
IEEE Transactions on Cognitive and Developmental Systems | Pub Date: 2024-03-03 | DOI: 10.1109/TCDS.2024.3396431
Nitin Rathi; Kaushik Roy
Abstract: Spiking neural networks (SNNs) are gaining popularity for their promise of low-power machine intelligence on event-driven neuromorphic hardware. SNNs have achieved performance comparable to artificial neural networks (ANNs) on static tasks (image classification) with lower compute energy. In this work, we explore the inherent dynamics of SNNs for sequential tasks such as gesture recognition, sentiment analysis, and sequence-to-sequence learning on data from dynamic vision sensors (DVSs) and natural language processing (NLP). Sequential data are generally processed with complex recurrent neural networks (RNNs) [long short-term memory/gated recurrent unit (LSTM/GRU)] that use explicit feedback connections and internal states to handle long-term dependencies. The neuron models in SNNs, integrate-and-fire (IF) or leaky integrate-and-fire (LIF), have internal states (membrane potential) that can be efficiently leveraged for sequential tasks: the membrane potential integrates the incoming current and outputs an event (or spike) when the potential crosses a threshold value. Since SNNs compute with highly sparse spike-based spatiotemporal data, the energy per inference is lower than for LSTMs/GRUs. We also show that SNNs require fewer parameters than LSTM/GRU, resulting in smaller models and faster inference. We observe the problem of vanishing gradients in vanilla SNNs for longer sequences and implement a convolutional SNN with attention layers to perform sequence-to-sequence learning tasks. The inherent recurrence in SNNs, in addition to the fully parallelized convolutional operations, provides additional mechanisms to model sequential dependencies, leading to better accuracy than convolutional neural networks (CNNs) with ReLU activations. We evaluate the SNN on gesture recognition with the IBM DVS dataset, sentiment analysis with the IMDB movie reviews dataset, and German-to-English translation with the Multi30k dataset.
Vol. 16, No. 6, pp. 1905-1914
Citations: 0
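The claim that membrane potential can replace explicit LSTM/GRU state can be shown with a LIF layer that scans a sequence: the potential carries information across steps with no gates, and the output is a sparse spike train. A minimal sketch; the leak, threshold, and reset-by-subtraction values are illustrative, and real SNN training would also need a surrogate gradient for the non-differentiable spike.

```python
import torch
import torch.nn as nn

class LIFSequenceLayer(nn.Module):
    """Processes a sequence step by step; the membrane potential acts as
    the recurrent state, so no explicit LSTM/GRU gating is needed."""

    def __init__(self, d_in, d_out, leak=0.9, v_th=1.0):
        super().__init__()
        self.proj = nn.Linear(d_in, d_out)
        self.leak, self.v_th = leak, v_th

    def forward(self, x):                  # x: (B, T, d_in)
        B, T, _ = x.shape
        v = torch.zeros(B, self.proj.out_features, device=x.device)
        out = []
        for t in range(T):
            v = self.leak * v + self.proj(x[:, t])   # leaky integration
            spk = (v >= self.v_th).float()
            v = v - spk * self.v_th                  # reset by subtraction
            out.append(spk)
        return torch.stack(out, dim=1)     # sparse spike sequence (B, T, d_out)

layer = LIFSequenceLayer(32, 64)
print(layer(torch.randn(4, 20, 32)).shape)  # torch.Size([4, 20, 64])
```

Note the parameter count: one linear projection versus the four gate matrices of an LSTM, which is the source of the smaller-model claim in the abstract.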
Machine Unlearning for Seizure Prediction
IF 5.0 | CAS Tier 3, Computer Science
IEEE Transactions on Cognitive and Developmental Systems | Pub Date: 2024-03-01 | DOI: 10.1109/TCDS.2024.3395663
Chenghao Shao; Chang Li; Rencheng Song; Xiang Liu; Ruobing Qian; Xun Chen
Abstract: In recent years, companies and organizations have been required to provide individuals with the right to be forgotten to alleviate privacy concerns. In machine learning, this requires researchers not only to delete data from databases but also to remove that data's influence from trained models; machine unlearning is thus becoming an emerging research problem. In seizure prediction, applications are mostly built on private electroencephalogram (EEG) signals. To provide the right to be forgotten, we propose a machine unlearning method for seizure prediction based on knowledge distillation, using two teacher models to guide the student model toward the model-level unlearning objective. One teacher model induces the student to forget the data of patients with unlearning requests (forgetting patients), while the other enables the student to retain the data information of the other patients (remaining patients). Experiments were conducted on the CHB-MIT and Kaggle databases. The results show that the proposed method effectively makes trained models forget the information of forgetting patients while maintaining satisfactory performance on remaining patients. To the best of our knowledge, this is the first work on machine unlearning in the seizure prediction field.
Vol. 16, No. 6, pp. 1969-1981
Citations: 0
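The abstract does not give the exact loss, but one plausible formulation of two-teacher unlearning distillation is sketched below: on samples from forgetting patients the student matches a teacher that never saw them, while on remaining patients it matches the original trained teacher. The function name, temperature, and weighting are assumptions.

```python
import torch
import torch.nn.functional as F

def unlearning_distill_loss(student_logits, retain_teacher_logits,
                            forget_teacher_logits, is_forget, T=2.0):
    """Two-teacher distillation for unlearning (hypothetical formulation).

    is_forget : (B,) boolean mask marking samples from unlearning requests
    """
    log_p = F.log_softmax(student_logits / T, dim=1)
    q_retain = F.softmax(retain_teacher_logits / T, dim=1)  # original teacher
    q_forget = F.softmax(forget_teacher_logits / T, dim=1)  # never saw them

    # per-sample KL divergences toward each teacher
    kl_retain = F.kl_div(log_p, q_retain, reduction="none").sum(1)
    kl_forget = F.kl_div(log_p, q_forget, reduction="none").sum(1)

    # route each sample to the teacher that matches its unlearning status
    w = is_forget.float()
    return ((1 - w) * kl_retain + w * kl_forget).mean() * T * T
```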
Robust Perception-Based Visual Simultaneous Localization and Tracking in Dynamic Environments
IF 5.0 | CAS Tier 3, Computer Science
IEEE Transactions on Cognitive and Developmental Systems | Pub Date: 2024-02-28 | DOI: 10.1109/TCDS.2024.3371073
Song Peng; Teng Ran; Liang Yuan; Jianbo Zhang; Wendong Xiao
Abstract: Visual simultaneous localization and mapping (SLAM) in dynamic scenes is a prerequisite for robot-related applications. Most existing SLAM algorithms focus on rejecting dynamic objects, which discards part of the valuable information and is prone to failure in complex environments. This article proposes a semantic visual SLAM system that incorporates rigid object tracking. A robust scene perception framework is designed that gives autonomous robots the ability to perceive scenes in a way similar to human cognition. Specifically, we propose a two-stage mask revision method to generate fine object masks. Based on the revised masks, we propose a semantic and geometric constraint (SAG) strategy, which provides a fast and robust way to perceive dynamic rigid objects. The motion tracking of rigid objects is then integrated into the SLAM pipeline, and a novel bundle adjustment is constructed to jointly optimize camera localization and object six-degree-of-freedom (DoF) poses. Finally, the proposed algorithm is evaluated on the publicly available KITTI dataset, the Oxford Multimotion dataset, and real-world scenarios. On the KITTI dataset, it achieves an overall RPE_t of less than 0.07 m per frame and an RPE_R of about 0.03° per frame. The experimental results reveal that the proposed algorithm delivers more accurate localization and more robust tracking than state-of-the-art SLAM algorithms in challenging dynamic scenarios.
Vol. 16, No. 4, pp. 1507-1520
Citations: 0
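The RPE_t and RPE_R figures quoted above are the standard frame-to-frame relative pose error metric. A minimal sketch of how it is computed from ground-truth and estimated trajectories, assuming 4x4 homogeneous camera poses:

```python
import numpy as np

def relative_pose_error(gt_poses, est_poses):
    """Frame-to-frame relative pose error (the metric quoted above).

    gt_poses, est_poses : sequences of 4x4 homogeneous camera pose matrices
    Returns per-frame translational error (m) and rotational error (deg).
    """
    rpe_t, rpe_r = [], []
    for i in range(len(gt_poses) - 1):
        gt_rel = np.linalg.inv(gt_poses[i]) @ gt_poses[i + 1]
        est_rel = np.linalg.inv(est_poses[i]) @ est_poses[i + 1]
        err = np.linalg.inv(gt_rel) @ est_rel   # residual relative motion

        rpe_t.append(np.linalg.norm(err[:3, 3]))
        # rotation angle of the residual rotation matrix
        cos_a = np.clip((np.trace(err[:3, :3]) - 1.0) / 2.0, -1.0, 1.0)
        rpe_r.append(np.degrees(np.arccos(cos_a)))
    return np.array(rpe_t), np.array(rpe_r)
```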
Brain Connectivity Analysis for EEG-Based Face Perception Task
IF 5.0 | CAS Tier 3, Computer Science
IEEE Transactions on Cognitive and Developmental Systems | Pub Date: 2024-02-27 | DOI: 10.1109/TCDS.2024.3370635
Debashis Das Chakladar; Nikhil R. Pal
Abstract: Face perception is considered a highly developed visual recognition skill in human beings. Most face perception studies have used functional magnetic resonance imaging to identify the brain cortices related to face perception; brain connectivity networks for face perception have not yet been studied with electroencephalography (EEG). In the proposed framework, a correlation-tree-traversal-based channel selection algorithm is first developed to identify the "optimum" EEG channels by removing highly correlated channels from the input channel set. Next, the effective brain connectivity network among those "optimum" EEG channels is built using multivariate transfer entropy (TE) while participants watch different face stimuli (famous, unfamiliar, and scrambled). We transform EEG channels into corresponding brain regions for generalization purposes and identify the active brain regions for each face stimulus. To find the stimulus-wise brain dynamics, the information transfer among the identified brain regions is estimated using several graph measures [global efficiency (GE) and transitivity]. Our model achieves a mean GE of 0.800, 0.695, and 0.581 for famous, unfamiliar, and scrambled faces, respectively. Identifying face-perception-specific brain regions will improve understanding of the EEG-based face-processing system, and understanding the brain networks of famous, unfamiliar, and scrambled faces can be useful in criminal investigation applications.
Vol. 16, No. 4, pp. 1494-1506
Citations: 0
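Global efficiency and transitivity are standard graph measures: GE is the mean inverse shortest-path length over all node pairs, and transitivity is three times the number of triangles divided by the number of connected triples. A sketch using networkx on a hypothetical brain-region graph; the regions and edges are placeholders, not the paper's TE-derived network.

```python
import networkx as nx

# Hypothetical region graph, e.g., thresholded from a transfer-entropy matrix
G = nx.Graph()
G.add_edges_from([
    ("occipital", "fusiform"), ("fusiform", "temporal"),
    ("temporal", "frontal"), ("occipital", "parietal"),
    ("parietal", "frontal"),
])

# GE near 1 means information can flow between regions along short paths
print("global efficiency:", nx.global_efficiency(G))
print("transitivity:", nx.transitivity(G))
```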
D-FaST: Cognitive Signal Decoding With Disentangled Frequency–Spatial–Temporal Attention
IF 5.0 | CAS Tier 3, Computer Science
IEEE Transactions on Cognitive and Developmental Systems | Pub Date: 2024-02-26 | DOI: 10.1109/TCDS.2024.3370261
WeiGuo Chen; Changjian Wang; Kele Xu; Yuan Yuan; Yanru Bai; Dongsong Zhang
Abstract: Cognitive language processing (CLP), situated at the intersection of natural language processing (NLP) and cognitive science, plays an increasingly pivotal role in artificial intelligence, cognitive intelligence, and brain science. Within CLP, cognitive signal decoding (CSD) has made remarkable progress, yet challenges remain: insufficient capability to represent global dynamics and deficiencies in multidomain feature integration. In this article, we introduce a novel paradigm for CLP called disentangled frequency–spatial–temporal attention (D-FaST). Specifically, we present a novel cognitive signal decoder that operates on disentangled frequency-space-time attention and encompasses three key components: frequency-domain feature extraction employing multiview attention (MVA), spatial-domain feature extraction utilizing dynamic brain-connection graph attention, and temporal feature extraction relying on local time-sliding-window attention, all integrated within a disentangled framework. Additionally, to encourage advances in this field, we have created a new CLP dataset, MNRED. We conducted an extensive series of experiments evaluating D-FaST's performance on MNRED as well as on the publicly available ZuCo, BCIC IV-2A, and BCIC IV-2B datasets. The results demonstrate that D-FaST significantly outperforms existing methods on both our dataset and traditional CSD datasets, establishing a state-of-the-art accuracy of 78.72% on MNRED and pushing accuracy to 78.35% on ZuCo, 74.85% on BCIC IV-2A, and 76.81% on BCIC IV-2B.
Vol. 16, No. 4, pp. 1476-1493
Citations: 0
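One way to picture the disentangled design: separate self-attention branches run over frequency bins (from an FFT of the signal), over channels, and over time steps, and their pooled outputs are fused. The branch layout, pooling, and concatenation fusion below are illustrative assumptions; the paper's MVA, dynamic graph attention, and sliding-window attention are more elaborate.

```python
import torch
import torch.nn as nn

class DisentangledAttention(nn.Module):
    """Three self-attention branches over the frequency, spatial (channel),
    and temporal axes of an EEG tensor, fused by concatenation."""

    def __init__(self, n_ch, n_t, n_heads=1):
        super().__init__()
        self.freq_att = nn.MultiheadAttention(n_ch, n_heads, batch_first=True)
        self.spat_att = nn.MultiheadAttention(n_t, n_heads, batch_first=True)
        self.temp_att = nn.MultiheadAttention(n_ch, n_heads, batch_first=True)

    def forward(self, x):                       # x: (B, C, T)
        xf = torch.fft.rfft(x, dim=-1).abs()    # (B, C, F) amplitude spectrum
        qf = xf.transpose(1, 2)                 # tokens = frequency bins
        f, _ = self.freq_att(qf, qf, qf)
        s, _ = self.spat_att(x, x, x)           # tokens = channels
        qt = x.transpose(1, 2)                  # tokens = time steps
        t, _ = self.temp_att(qt, qt, qt)
        # pool each branch to a vector and concatenate the three views
        return torch.cat([f.mean(1), s.mean(1), t.mean(1)], dim=-1)

m = DisentangledAttention(n_ch=32, n_t=64)
print(m(torch.randn(4, 32, 64)).shape)          # torch.Size([4, 128])
```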
DTCM: Deep Transformer Capsule Mutual Distillation for Multivariate Time Series Classification
IF 5.0 | CAS Tier 3, Computer Science
IEEE Transactions on Cognitive and Developmental Systems | Pub Date: 2024-02-26 | DOI: 10.1109/TCDS.2024.3370219
Zhiwen Xiao; Xin Xu; Huanlai Xing; Bowen Zhao; Xinhan Wang; Fuhong Song; Rong Qu; Li Feng
Abstract: This article proposes a dual-network feature extractor for multivariate time series classification (MTSC), the perceptive capsule network (PCapN), comprising a local feature network (LFN) and a global relation network (GRN). The LFN has two heads (Head_A and Head_B), each containing two squash convolutional neural network (CNN) blocks and one dynamic routing block, to extract local features from the data and mine the connections among them. The GRN consists of two capsule-based transformer blocks and one dynamic routing block, which capture the global patterns of each variable and correlate useful information across multiple variables. Because PCapN's strict computing requirements make it difficult to deploy directly on mobile devices, this article also designs a lightweight capsule network (LCapN) to mimic the cumbersome PCapN. To promote knowledge transfer from PCapN to LCapN, this article proposes a deep transformer capsule mutual (DTCM) distillation method. It is targeted and offline, using one-way and two-way operations to supervise the knowledge distillation (KD) process for the dual-network student and teacher models. Experimental results show that the proposed PCapN and DTCM achieve excellent performance on the University of East Anglia 2018 (UEA2018) datasets in terms of top-1 accuracy.
Vol. 16, No. 4, pp. 1445-1461
Citations: 0
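Mutual distillation, in its generic two-way form, trains each network on the labels plus a KL term toward the other network's softened predictions, so knowledge flows in both directions rather than only teacher to student. A hedged sketch of that generic component; DTCM's targeted, offline one-/two-way scheme for the capsule teacher and student is not reproduced here, and the temperature and mixing weight are assumptions.

```python
import torch
import torch.nn.functional as F

def mutual_distillation_losses(logits_a, logits_b, labels, T=3.0, alpha=0.5):
    """Generic two-way mutual distillation between two networks A and B."""
    ce_a = F.cross_entropy(logits_a, labels)
    ce_b = F.cross_entropy(logits_b, labels)

    # each network mimics the other's softened (temperature T) predictions;
    # detach() stops gradients flowing into the network acting as "teacher"
    kd_a = F.kl_div(F.log_softmax(logits_a / T, dim=1),
                    F.softmax(logits_b.detach() / T, dim=1),
                    reduction="batchmean") * T * T
    kd_b = F.kl_div(F.log_softmax(logits_b / T, dim=1),
                    F.softmax(logits_a.detach() / T, dim=1),
                    reduction="batchmean") * T * T

    loss_a = (1 - alpha) * ce_a + alpha * kd_a
    loss_b = (1 - alpha) * ce_b + alpha * kd_b
    return loss_a, loss_b
```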