Xiaofei Zhang, Zhengping Fan, Ying Shen, Yining Li, Yasong An, Xiaojun Tan
{"title":"MAEMOT: Pretrained MAE-Based Antiocclusion 3-D Multiobject Tracking for Autonomous Driving.","authors":"Xiaofei Zhang, Zhengping Fan, Ying Shen, Yining Li, Yasong An, Xiaojun Tan","doi":"10.1109/TNNLS.2024.3480148","DOIUrl":"10.1109/TNNLS.2024.3480148","url":null,"abstract":"<p><p>The existing 3-D multiobject tracking (MOT) methods suffer from object occlusion in real-world traffic scenes. However, previous works have faced challenges in providing a reasonable solution to the fundamental question: \"How can the interference of the perception data loss caused by occlusion be overcome?\" Therefore, this article attempts to provide a reasonable solution by developing a novel pretrained movement-constrained masked autoencoder (M-MAE) for an antiocclusion 3-D MOT called MAEMOT. Specifically, for the pretrained M-MAE, this article adopts an efficient multistage transformer (MST) encoder and a spatiotemporal-based motion decoder to predict and reconstruct occluded point cloud data, following the properties of object motion. Afterward, the well-trained M-MAE model extracts the global features of occluded objects, ensuring that the features of the intraobjects between interframes are as consistent as possible throughout the spatiotemporal sequence. Next, a proposal-based geometric graph aggregation (PG<sup>2</sup>A) module is utilized to extract and fuse the spatial features of each proposal, producing refined region-of-interest (RoI) components. Finally, this article designs an object association module that combines geometric and corner affinities, which helps to match the predicted occlusion objects more robustly. 
According to an extensive evaluation, the proposed MAEMOT method can effectively overcome the interference of occlusion and achieve improved 3-D MOT performance under challenging conditions.</p>","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"PP ","pages":""},"PeriodicalIF":10.2,"publicationDate":"2024-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142557730","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spike-and-Slab Shrinkage Priors for Structurally Sparse Bayesian Neural Networks.","authors":"Sanket Jantre, Shrijita Bhattacharya, Tapabrata Maiti","doi":"10.1109/TNNLS.2024.3485529","DOIUrl":"10.1109/TNNLS.2024.3485529","url":null,"abstract":"<p><p>Network complexity and computational efficiency have become increasingly significant aspects of deep learning. Sparse deep learning addresses these challenges by recovering a sparse representation of the underlying target function by reducing heavily overparameterized deep neural networks. Specifically, deep neural architectures compressed via structured sparsity (e.g., node sparsity) provide low-latency inference, higher data throughput, and reduced energy consumption. In this article, we explore two well-established shrinkage techniques, Lasso and Horseshoe, for model compression in Bayesian neural networks (BNNs). To this end, we propose structurally sparse BNNs, which systematically prune excessive nodes with the following: 1) spike-and-slab group Lasso (SS-GL) and 2) SS group Horseshoe (SS-GHS) priors, and develop computationally tractable variational inference, including continuous relaxation of Bernoulli variables. We establish the contraction rates of the variational posterior of our proposed models as a function of the network topology, layerwise node cardinalities, and bounds on the network weights. 
We empirically demonstrate the competitive performance of our models compared with the baseline models in prediction accuracy, model compression, and inference latency.</p>","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"PP ","pages":""},"PeriodicalIF":10.2,"publicationDate":"2024-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142557731","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-Task Multi-Agent Reinforcement Learning With Interaction and Task Representations.","authors":"Chao Li, Shaokang Dong, Shangdong Yang, Yujing Hu, Tianyu Ding, Wenbin Li, Yang Gao","doi":"10.1109/TNNLS.2024.3475216","DOIUrl":"https://doi.org/10.1109/TNNLS.2024.3475216","url":null,"abstract":"<p><p>Multi-task multi-agent reinforcement learning (MT-MARL) is capable of leveraging useful knowledge across multiple related tasks to improve performance on any single task. While recent studies have tentatively achieved this by learning independent policies on a shared representation space, we pinpoint that further advancements can be realized by explicitly characterizing agent interactions within these multi-agent tasks and identifying task relations for selective reuse. To this end, this article proposes Representing Interactions and Tasks (RIT), a novel MT-MARL algorithm that characterizes both intra-task agent interactions and inter-task task relations. Specifically, for characterizing agent interactions, RIT presents the interactive value decomposition to explicitly take the dependency among agents into policy learning. Theoretical analysis demonstrates that the learned utility value of each agent approximates its Shapley value, thus representing agent interactions. Moreover, we learn task representations based on per-agent local trajectories, which assess task similarities and accordingly identify task relations. As a result, RIT facilitates the effective transfer of interaction knowledge across similar multi-agent tasks. Structurally, RIT develops universal policy structure for scalable multi-task policy learning. 
We evaluate RIT against multiple state-of-the-art baselines on various cooperative tasks; its strong performance under both multi-task and zero-shot settings demonstrates its effectiveness.</p>","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"PP ","pages":""},"PeriodicalIF":10.2,"publicationDate":"2024-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142545189","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEEE Transactions on Neural Networks and Learning Systems Publication Information","authors":"","doi":"10.1109/TNNLS.2024.3478713","DOIUrl":"10.1109/TNNLS.2024.3478713","url":null,"abstract":"","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"35 11","pages":"C2-C2"},"PeriodicalIF":10.2,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10737918","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142541235","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEEE Computational Intelligence Society Information","authors":"","doi":"10.1109/TNNLS.2024.3478709","DOIUrl":"https://doi.org/10.1109/TNNLS.2024.3478709","url":null,"abstract":"","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"35 11","pages":"C3-C3"},"PeriodicalIF":10.2,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10737997","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142540397","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IEEE Transactions on Neural Networks and Learning Systems Information for Authors","authors":"","doi":"10.1109/TNNLS.2024.3478711","DOIUrl":"https://doi.org/10.1109/TNNLS.2024.3478711","url":null,"abstract":"","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"35 11","pages":"C4-C4"},"PeriodicalIF":10.2,"publicationDate":"2024-10-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10737912","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142540434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Representation Learning Based on Co-Evolutionary Combined With Probability Distribution Optimization for Precise Defect Location.","authors":"Jinglin Zhang, Zekai Zhang, Qinghui Chen, Gang Li, Weiyu Li, Shijiao Ding, Maomao Xiong, Wenhao Zhang, Shengyong Chen","doi":"10.1109/TNNLS.2024.3479734","DOIUrl":"https://doi.org/10.1109/TNNLS.2024.3479734","url":null,"abstract":"<p><p>Visual defect detection methods based on representation learning play an important role in industrial scenarios. Defect detection technology based on representation learning has made significant progress. However, existing defect detection methods still face three challenges: first, the extreme scarcity of industrial defect samples makes training difficult. Second, due to the characteristics of industrial defects, such as blur and background interference, it is challenging to obtain fuzzy defect separation edges and context information. Third, industrial defects cannot obtain accurate positioning information. This article proposes feature co-evolution interaction architecture (CIA) and glass container defect dataset to address the above challenges. Specifically, the contributions of this article are as follows: first, this article designs a glass container image acquisition system that combines RGB and polarization information to create a glass container defect dataset containing more than 60 000 samples to alleviate the sample scarcity problem in industrial scenarios. Subsequently, this article designs the CIA. CIA optimizes the probability distribution of features through the co-evolution of edge and context features, thereby improving detection accuracy in blurred defects and noisy environments. Finally, this article proposes a novel inforced IoU loss (IIoU loss), which can obtain more accurate position information by being aware of the scale changes of the predicted box. 
Defect detection experiments on three mainstream industrial manufacturing categories (Northeastern University (NEU)-Det, glass containers, and wood) show that CIA uses only 22.5 GFLOPs while its mean average precision (mAP) (NEU-Det: 88.74%, glass containers: 95.38%, wood: 68.42%) exceeds that of state-of-the-art methods.</p>","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"PP ","pages":""},"PeriodicalIF":10.2,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142521808","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scale-Aware Super-Resolution Network With Dual Affinity Learning for Lesion Segmentation From Medical Images.","authors":"Luyang Luo, Yanwen Li, Zhizhong Chai, Huangjing Lin, Pheng-Ann Heng, Hao Chen","doi":"10.1109/TNNLS.2024.3477947","DOIUrl":"https://doi.org/10.1109/TNNLS.2024.3477947","url":null,"abstract":"<p><p>Convolutional neural networks (CNNs) have shown remarkable progress in medical image segmentation. However, the lesion segmentation remains a challenge to state-of-the-art CNN-based algorithms due to the variance in scales and shapes. On the one hand, tiny lesions are hard to delineate precisely from the medical images which are often of low resolutions. On the other hand, segmenting large-size lesions requires large receptive fields, which exacerbates the first challenge. In this article, we present a scale-aware super-resolution (SR) network to adaptively segment lesions of various sizes from low-resolution (LR) medical images. Our proposed network contains dual branches to simultaneously conduct lesion mask SR (LMSR) and lesion image SR (LISR). Meanwhile, we introduce scale-aware dilated convolution (SDC) blocks into the multitask decoders to adaptively adjust the receptive fields of the convolutional kernels according to the lesion sizes. To guide the segmentation branch to learn from richer high-resolution (HR) features, we propose a feature affinity (FA) module and a scale affinity (SA) module to enhance the multitask learning of the dual branches. On multiple challenging lesion segmentation datasets, our proposed network achieved consistent improvements compared with other state-of-the-art methods. 
Code will be available at: https://github.com/poiuohke/SASR_Net.</p>","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"PP ","pages":""},"PeriodicalIF":10.2,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142521809","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Weighted Hermite Variable Projection Networks for Classifying Visually Evoked Potentials.","authors":"Tamas Dozsa, Carl Bock, Jens Meier, Peter Kovacs","doi":"10.1109/TNNLS.2024.3475271","DOIUrl":"https://doi.org/10.1109/TNNLS.2024.3475271","url":null,"abstract":"<p><p>The occipital cortex responds to visual stimuli regardless of a patient's level of consciousness or attention, offering a noninvasive diagnostic tool for both ophthalmologists and neurologists. This response signal manifests as a unique waveform referred to as the visually evoked potential (VEP), which can be extracted from the electroencephalogram (EEG) activity of a human being. We propose a trainable VEP representation to disentangle the underlying explanatory factors of the data. To enhance the learning process with domain knowledge, we present an innovative parameterization of classical Hermite functions that effectively captures VEP pattern variations arising from patient-specific factors, disorders, and measurement setup influences. Then, we introduce a differentiable variable projection (VP) layer to fuse Hermite basis function expansions (BFEs) of VEP signals with machine learning (ML) approaches. We prove the existence of an optimal set of parameters in the least-squares sense, assess the representation power of such layers, and calculate their analytical derivatives, which allows us to utilize backpropagation for training. Finally, we evaluate the effectiveness of the proposed learning framework in VEP-based color classification. 
To achieve this, we have designed a novel measurement system dedicated to intraoperative clinical use cases, which presents new ways for patient monitoring during neurosurgical procedures.</p>","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"PP ","pages":""},"PeriodicalIF":10.2,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142521810","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DSE-Based Hardware Trojan Attack for Neural Network Accelerators on FPGAs.","authors":"Chao Guo, Masao Yanagisawa, Youhua Shi","doi":"10.1109/TNNLS.2024.3482364","DOIUrl":"https://doi.org/10.1109/TNNLS.2024.3482364","url":null,"abstract":"<p><p>Over the past few years, the emergence and development of design space exploration (DSE) have shortened the deployment cycle of deep neural networks (DNNs). As a result, with these open-sourced DSE, we can automatically compute the optimal configuration and generate the corresponding accelerator intellectual properties (IPs) from the pretrained neural network models and hardware constraints. However, to date, the security of DSE has received little attention. Therefore, we explore this issue from an adversarial perspective and propose an automated hardware Trojan (HT) generation framework embedded within DSE. The framework uses an evolutionary algorithm (EA) to analyze user-input data to automatically generate the attack code before placing it in the final output accelerator IPs. The proposed HT is sufficiently stealthy and suitable for both single and multifield-programmable gate array (FPGA) designs. It can also implement controlled accuracy degradation attacks and specified category attacks. We conducted experiments on LeNet, VGG-16, and YOLO, respectively, and found that for the LeNet model trained on the CIFAR-10 dataset, attacking only one kernel resulted in 97.3% of images being classified in the category specified by the adversary and reduced accuracy by 59.58%. Moreover, for the VGG-16 model trained on the ImageNet dataset, attacking eight kernels can cause up to 96.53% of the images to be classified into the category specified by the adversary and causes the model's accuracy to decrease to 2.5%. 
Finally, for the YOLO model trained on the PASCAL VOC dataset, attacking eight kernels can cause the model to identify the target as the adversary-specified category and slightly perturb the bounding boxes. Compared with the uncompromised designs, the look-up table (LUT) overhead of the proposed HT design does not exceed 0.6%.</p>","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"PP ","pages":""},"PeriodicalIF":10.2,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142521807","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}