Diego Stucchi;Luca Magri;Diego Carrera;Giacomo Boracchi
{"title":"Multimodal Batch-Wise Change Detection","authors":"Diego Stucchi;Luca Magri;Diego Carrera;Giacomo Boracchi","doi":"10.1109/TNNLS.2023.3294846","DOIUrl":"10.1109/TNNLS.2023.3294846","url":null,"abstract":"We address the problem of detecting distribution changes in a novel batch-wise and multimodal setup. This setup is characterized by a stationary condition where batches are drawn from potentially different modalities among a set of distributions in $\\mathbb{R}^{d}$ represented in the training set. Existing change detection (CD) algorithms assume that a unique (possibly multipeaked) distribution characterizes stationary conditions and, in a batch-wise multimodal context, exhibit either low detection power or poor control of false positives. We present MultiModal QuantTree (MMQT), a novel CD algorithm that uses a single histogram to model the batch-wise multimodal stationary conditions. During testing, MMQT automatically identifies which modality has generated the incoming batch and detects changes by means of a modality-specific statistic. We leverage the theoretical properties of QuantTree to: 1) automatically estimate the number of modalities in a training set and 2) derive a principled calibration procedure that guarantees false-positive control. Our experiments show that MMQT achieves high detection power and accurate control over false positives in synthetic and real-world multimodal CD problems. 
Moreover, we show the potential of MMQT in Stream Learning applications, where it proves effective at detecting concept drifts and the emergence of novel classes by solely monitoring the input distribution.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"34 10","pages":"6783-6797"},"PeriodicalIF":10.4,"publicationDate":"2023-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/iel7/5962385/10273172/10219143.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10006770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chi Huang, Wenjun Xiong, Jianquan Lu, Darong Huang
{"title":"Asymptotic Stability of Delayed Boolean Networks With Random Data Dropouts.","authors":"Chi Huang, Wenjun Xiong, Jianquan Lu, Darong Huang","doi":"10.1109/TNNLS.2023.3301220","DOIUrl":"10.1109/TNNLS.2023.3301220","url":null,"abstract":"In real networks, communication constraints inevitably prevent the full exchange of information between nodes. This brief investigates the problem of time delay and randomly missing data in Boolean networks (BNs). A Bernoulli random variable is assigned to each node to characterize the probability of data packet dropout. Time delay and missing data are modeled by independent random variables. A novel data-sending rule that incorporates both communication constraints is proposed. An augmented system, comprising current states, delayed information, and successfully transmitted data, is established for theoretical analysis. Using the semitensor product (STP), the necessary and sufficient condition for asymptotic stability of delayed BNs with random data dropouts is derived. The convergence rate is also obtained.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"PP ","pages":""},"PeriodicalIF":10.4,"publicationDate":"2023-08-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10008743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PSAQ-ViT V2: Toward Accurate and General Data-Free Quantization for Vision Transformers.","authors":"Zhikai Li, Mengjuan Chen, Junrui Xiao, Qingyi Gu","doi":"10.1109/TNNLS.2023.3301007","DOIUrl":"10.1109/TNNLS.2023.3301007","url":null,"abstract":"Data-free quantization can potentially address data privacy and security concerns in model compression and thus has been widely investigated. Recently, patch similarity aware data-free quantization for vision transformers (PSAQ-ViT) introduced a relative value metric, patch similarity, to generate data from pretrained vision transformers (ViTs), making the first attempt at data-free quantization for ViTs. In this article, we propose PSAQ-ViT V2, a more accurate and general data-free quantization framework for ViTs, built on top of PSAQ-ViT. More specifically, following the patch similarity metric in PSAQ-ViT, we introduce an adaptive teacher-student strategy, which facilitates the constant cyclic evolution of the generated samples and the quantized model in a competitive and interactive fashion under the supervision of the full-precision (FP) model (teacher), thus significantly improving the accuracy of the quantized model. Moreover, without the auxiliary category guidance, we employ task- and model-independent prior information, making the general-purpose scheme compatible with a broad range of vision tasks and models. Extensive experiments are conducted on various models on image classification, object detection, and semantic segmentation tasks, and PSAQ-ViT V2, with the naive quantization strategy and without access to real-world data, consistently achieves competitive results, showing potential as a powerful baseline on data-free quantization for ViTs. For instance, with Swin-S as the (backbone) model, 8-bit quantization reaches 82.13 top-1 accuracy on ImageNet, 50.9 box AP and 44.1 mask AP on COCO, and 47.2 mean Intersection over Union (mIoU) on ADE20K. 
We hope that the accurate and general PSAQ-ViT V2 can serve as a practical solution in real-world applications involving sensitive data. Code is released and merged at: https://github.com/zkkli/PSAQ-ViT.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"PP ","pages":""},"PeriodicalIF":10.4,"publicationDate":"2023-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10000545","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Multistage Information Complementary Fusion Network Based on Flexible-Mixup for HSI-X Image Classification.","authors":"Junjie Wang, Mengmeng Zhang, Wei Li, Ran Tao","doi":"10.1109/TNNLS.2023.3300903","DOIUrl":"10.1109/TNNLS.2023.3300903","url":null,"abstract":"Mixup-based data augmentation has been proven to be beneficial to the regularization of models during training, especially in the remote-sensing field where training data are scarce. However, during data augmentation, Mixup-based methods ignore the target proportion in different inputs and keep the linear insertion ratio consistent, so the label space responds even when, due to the randomness of the augmentation process, no effective objects are introduced in the mixed image. Moreover, although some previous works have attempted to utilize different multimodal interaction strategies, they could not be well extended to various remote-sensing data combinations. To this end, a multistage information complementary fusion network based on flexible-mixup (Flex-MCFNet) is proposed for hyperspectral-X image classification. First, to bridge the gap between the mixed image and the label, a flexible-mixup (FlexMix) data augmentation strategy is designed, where the weight of the label increases with the ratio of the input image, preventing the introduction of invalid information from negatively affecting the label space. More importantly, to summarize diverse remote-sensing data inputs including various modal supplements and uncertainties, a multistage information complementary fusion network (MCFNet) is developed. 
After the features of the hyperspectral modality and the complementary X modalities, including multispectral, synthetic aperture radar (SAR), and light detection and ranging (LiDAR), are extracted separately, the information across modalities interacts and is enhanced through multiple stages of information complement and fusion, and the result is used for the final image classification. Extensive experimental results have demonstrated that Flex-MCFNet can not only effectively expand the training data, but also adequately regularize different data combinations to achieve state-of-the-art performance.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"PP ","pages":""},"PeriodicalIF":10.4,"publicationDate":"2023-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10002533","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhonghong Ou, Zongzhi Han, Peihang Liu, Shengyu Teng, Meina Song
{"title":"SIIR: Symmetrical Information Interaction Modeling for News Recommendation.","authors":"Zhonghong Ou, Zongzhi Han, Peihang Liu, Shengyu Teng, Meina Song","doi":"10.1109/TNNLS.2023.3299790","DOIUrl":"10.1109/TNNLS.2023.3299790","url":null,"abstract":"Accurate matching between user and candidate news plays a fundamental role in news recommendation. Most existing studies capture fine-grained user interests through effective user modeling. Nevertheless, user interest representations are often extracted from multiple historical news items, while candidate news representations are learned from specific news items. This asymmetry of information density causes invalid matching between user interests and candidate news, which severely affects click-through rate prediction for specific candidate news. To resolve these problems, we propose symmetrical information interaction modeling for news recommendation (SIIR) in this article. We first design a light interactive attention network for user (LIAU) modeling to extract user interests related to the candidate news and reduce interference of noise effectively. LIAU overcomes the shortcomings of complex structure and high training costs of conventional interaction-based models and makes full use of domain-specific interest tendencies of users. We then propose a novel heterogeneous graph neural network (HGNN) to enhance candidate news representation through the potential relations among news. HGNN builds a candidate news enhancement scheme without user interaction to further facilitate accurate matching with user interests, which mitigates the cold-start problem effectively. 
Experiments on two realistic news datasets, i.e., MIND and Adressa, demonstrate that SIIR outperforms the state-of-the-art (SOTA) single-model methods by a large margin.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"PP ","pages":""},"PeriodicalIF":10.4,"publicationDate":"2023-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10002529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Address the Unseen Relationships: Attribute Correlations in Text Attribute Person Search","authors":"Xi Yang;Xiaoqi Wang;Nannan Wang;Xinbo Gao","doi":"10.1109/TNNLS.2023.3300582","DOIUrl":"10.1109/TNNLS.2023.3300582","url":null,"abstract":"Text attribute person search aims to identify a particular pedestrian from textual attribute information. Compared to person re-identification tasks, which require image samples as queries, text attribute person search is more useful in circumstances where only a witness is available. Most existing text attribute person search methods focus on improving the matching correlation and alignments by learning better representations of person–attribute instance pairs, with little consideration of the latent correlations between attributes. In this work, we propose a graph convolutional network (GCN) and pseudo-label-based text attribute person search method. Concretely, the model directly constructs the attribute correlations from label co-occurrence probability, in which the nodes are represented by attribute embeddings and the edges by the filtered correlation matrix of attribute labels. In order to obtain better representations, we combine the cross-attention module (CAM) and the GCN. Furthermore, to address the unseen attribute relationships, we update the edge information using test-set instances with high predicted probability, so as to better adapt to the attribute distribution. 
Extensive experiments illustrate that our model outperforms the existing state-of-the-art methods on publicly available person search benchmarks: Market-1501 and PETA.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"35 11","pages":"16916-16926"},"PeriodicalIF":10.2,"publicationDate":"2023-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9970230","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Attention-Driven Memory Network for Online Visual Tracking.","authors":"Huanlong Zhang, Jiamei Liang, Jiapeng Zhang, Tianzhu Zhang, Yingzi Lin, Yanfeng Wang","doi":"10.1109/TNNLS.2023.3299412","DOIUrl":"10.1109/TNNLS.2023.3299412","url":null,"abstract":"Memory mechanisms have attracted growing attention in tracking tasks due to their ability to learn long-term-dependent information. However, it is very challenging for existing memory modules to provide the intrinsic attribute information of the target to the tracker in complex scenes. In this article, by considering the biological visual memory mechanisms, we propose a novel online tracking method via an attention-driven memory network, which can mine discriminative memory information and enhance the robustness and reliability of the tracker. First, to reinforce the effectiveness of memory content, we design a novel attention-driven memory network. In the network, the long-term memory module gains property-level memory information by focusing on the state of the target at both the channel and spatial levels. Meanwhile, as a complement, we add a short-term memory module to maintain good adaptability when confronting drastic deformation of the target. The attention-driven memory network can adaptively adjust the contribution of short-term and long-term memories to tracking results under the weighted gradient harmonized loss. On this basis, to avoid model performance degradation, an online memory updater (MU) is further proposed. It is designed to mine target information in tracking results through the Mixer layer and the online head network together. By evaluating the confidence of the tracking results, the memory updater can accurately judge when to update the model, which guarantees the effectiveness of online memory updates. 
Finally, the proposed method performs favorably and has been extensively validated on several benchmark datasets, including object tracking benchmark-50/100 (OTB-50/100), temple color-128 (TC-128), unmanned aerial vehicles-123 (UAV-123), generic object tracking-10k (GOT-10k), visual object tracking-2016 (VOT-2016), and VOT-2018 against several advanced methods.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"PP ","pages":""},"PeriodicalIF":10.4,"publicationDate":"2023-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9970232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Brain Network Classification for Accurate Detection of Alzheimer's Disease via Manifold Harmonic Discriminant Analysis.","authors":"Hongmin Cai, Xiaoqi Sheng, Guorong Wu, Bin Hu, Yiu-Ming Cheung, Jiazhou Chen","doi":"10.1109/TNNLS.2023.3301456","DOIUrl":"10.1109/TNNLS.2023.3301456","url":null,"abstract":"Mounting evidence shows that Alzheimer's disease (AD) manifests the dysfunction of the brain network well before the onset of clinical symptoms, making its early diagnosis possible. Current brain network analyses treat high-dimensional network data as a regular matrix or vector, which destroys the essential network topology, thereby seriously affecting diagnosis accuracy. In this context, harmonic waves provide a solid theoretical background for exploring brain network topology. However, the harmonic waves are originally intended to discover neurological disease propagation patterns in the brain, which makes it difficult to accommodate brain disease diagnosis with high heterogeneity. To address this challenge, this article proposes a network manifold harmonic discriminant analysis (MHDA) method for accurately detecting AD. Each brain network is regarded as an instance drawn on a Stiefel manifold. Every instance is represented by a set of orthonormal eigenvectors (i.e., harmonic waves) derived from its Laplacian matrix, which fully respects the topological structure of the brain network. An MHDA method within the Stiefel space is proposed to identify the group-dependent common harmonic waves, which can be used as group-specific references for downstream analyses. 
Extensive experiments are conducted to demonstrate the effectiveness of the proposed method in stratifying cognitively normal (CN) controls, mild cognitive impairment (MCI), and AD.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"PP ","pages":""},"PeriodicalIF":10.4,"publicationDate":"2023-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10858979/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10118420","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Niall Taylor;Yi Zhang;Dan W. Joyce;Ziming Gao;Andrey Kormilitzin;Alejo Nevado-Holgado
{"title":"Clinical Prompt Learning With Frozen Language Models","authors":"Niall Taylor;Yi Zhang;Dan W. Joyce;Ziming Gao;Andrey Kormilitzin;Alejo Nevado-Holgado","doi":"10.1109/TNNLS.2023.3294633","DOIUrl":"10.1109/TNNLS.2023.3294633","url":null,"abstract":"When the first transformer-based language models were published in the late 2010s, pretraining with general text and then fine-tuning the model on a task-specific dataset often achieved the state-of-the-art performance. However, more recent work suggests that for some tasks, directly prompting the pretrained model matches or surpasses fine-tuning in performance with few or no model parameter updates required. The use of prompts with language models for natural language processing (NLP) tasks is known as prompt learning. We investigated the viability of prompt learning on clinically meaningful decision tasks and directly compared this with more traditional fine-tuning methods. Results show that prompt learning methods were able to match or surpass the performance of traditional fine-tuning with up to 1000 times fewer trainable parameters, less training time, less training data, and lower computation resource requirements. We argue that these characteristics make prompt learning a very desirable alternative to traditional fine-tuning for clinical tasks, where the computational resources of public health providers are limited, and where data can often not be made available or not be used for fine-tuning due to patient privacy concerns. 
The complementary code to reproduce the experiments presented in this work can be found at https://github.com/NtaylorOX/Public_Clinical_Prompt.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"35 11","pages":"16453-16463"},"PeriodicalIF":10.2,"publicationDate":"2023-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10036595","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Robustness of Intent Detection Under Adversarial Attacks: A Geometric Constraint Perspective","authors":"Biqing Qi;Bowen Zhou;Weinan Zhang;Jianxing Liu;Ligang Wu","doi":"10.1109/TNNLS.2023.3267460","DOIUrl":"10.1109/TNNLS.2023.3267460","url":null,"abstract":"Natural language processing (NLP) systems based on deep neural networks (DNNs) are vulnerable to the adversarial examples presented in recent studies. Intent detection tasks in dialog systems are no exception; however, relatively few works have addressed the defense side. The combination of linear classifier and softmax is widely used in most defense methods for other NLP tasks. Unfortunately, it does not encourage the model to learn well-separated feature representations. Thus, it is easy to induce adversarial examples. In this article, we propose a simple yet efficient defense method from the geometric constraint perspective. Specifically, we first propose an M-similarity metric to shrink variances of intraclass features. Intuitively, better geometric conditions of feature space can bring lower misclassification probability (MP). Therefore, we derive the optimal geometric constraints of anchors within each category from the overall MP (OMP) with theoretical guarantees. Due to the nonconvex characteristic of the optimal geometric condition, it is hard to satisfy with a traditional optimization process. To this end, we regard such geometric constraints as manifold optimization processes in the Stiefel manifold, thus naturally avoiding the above challenges. 
Experimental results demonstrate that our method can significantly improve robustness compared with baselines, while retaining the excellent performance on normal examples.","PeriodicalId":13303,"journal":{"name":"IEEE transactions on neural networks and learning systems","volume":"35 5","pages":"6133-6144"},"PeriodicalIF":10.4,"publicationDate":"2023-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9970231","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}