Quantum federated learning with pole-angle quantum local training and trainable measurement
Soohyun Park, Hyunsoo Lee, Seok Bin Son, Soyi Jung, Joongheon Kim
Neural Networks, Pub Date: 2025-02-27, DOI: 10.1016/j.neunet.2025.107301 (Volume 187, Article 107301)

Abstract: Recently, quantum federated learning (QFL) has received significant attention as an innovative paradigm. QFL offers remarkable advantages by employing quantum neural networks (QNNs) instead of conventional neural networks, owing to quantum supremacy. To enhance the flexibility and reliability of classical QFL frameworks, this paper proposes a novel slimmable QFL (SlimQFL) incorporating QNN-grounded slimmable neural network (QSNN) architectures. This design accounts for time-varying wireless communication channels and computing resource constraints, and achieves higher efficiency by using fewer parameters with no performance loss. Furthermore, the proposed QNN is novel in its implementation of trainable measurement within QFL. The fundamental concept of our QSNN rests on two key characteristics: separated training and the dynamic exploitation of joint angle and pole parameters. Our performance evaluation verifies that, using both parameters, the proposed QSNN-based SlimQFL achieves higher classification accuracy than standard QFL and ensures transmission stability, particularly under poor channel conditions.

Episodic Memory-Double Actor–Critic Twin Delayed Deep Deterministic Policy Gradient
Man Shu, Shuai Lü, Xiaoyu Gong, Daolong An, Songlin Li
Neural Networks, Pub Date: 2025-02-27, DOI: 10.1016/j.neunet.2025.107286 (Volume 187, Article 107286)

Abstract: Existing deep reinforcement learning (DRL) algorithms suffer from low sample efficiency. Episodic memory allows DRL algorithms to remember and reuse past experiences with high return, thereby improving sample efficiency. However, due to the high dimensionality of the state–action space in continuous action tasks, previous methods often only utilize the information stored in episodic memory, rather than directly employing episodic memory for action selection as is done in discrete action tasks. We posit that episodic memory retains the potential to guide action selection in continuous control tasks. Our objective is to enhance sample efficiency by leveraging episodic memory for action selection in such tasks, either reducing the number of training steps required to achieve comparable performance or enabling the agent to obtain higher rewards within the same number of training steps. To this end, we propose an Episodic Memory-Double Actor–Critic (EMDAC) framework, which can use episodic memory for action selection in continuous action tasks. The critics and the episodic memory evaluate the value of the state–action pairs selected by the two actors to determine the final action. Meanwhile, we design an episodic memory based on a Kalman filter optimizer, which is updated using the episodic rewards of collected state–action pairs; the Kalman filter optimizer assigns different weights to experiences collected at different time periods during the memory update process. In our episodic memory, state–action pair clusters are used as indices, recording both the occurrence frequency of these clusters and the value estimates for the corresponding state–action pairs. This enables the value of a state–action pair cluster to be estimated by querying the episodic memory. We then design an intrinsic reward based on the novelty of state–action pairs with respect to the episodic memory, defined by the occurrence frequency of state–action pair clusters, to enhance the exploration capability of the agent. Ultimately, we propose an EMDAC-TD3 algorithm by applying these three modules to the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm within an Actor–Critic framework. In evaluations in MuJoCo environments within the OpenAI Gym domain, EMDAC-TD3 achieves higher sample efficiency than baseline algorithms and demonstrates superior final performance compared to state-of-the-art episodic control algorithms and advanced Actor–Critic algorithms, as measured by final rewards, Median, Interquartile Mean, Mean, and Optimality Gap. The final rewards directly demonstrate the advantages of the algorithms: EMDAC-TD3 achieves an average performance improvement of 11.01% over TD3, surpassing the current state-of-the-art algorithms in the same category.
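
The cluster-indexed memory and the novelty bonus can be made concrete with a short sketch. Below is a toy Python version in which state–action pairs are "clustered" by grid discretization and values are updated with a running mean; the class name and these simplifications are ours, and the paper additionally weights updates with a Kalman filter optimizer.

```python
import numpy as np
from collections import defaultdict

class ClusterEpisodicMemory:
    """Toy episodic memory indexed by discretized state-action clusters.

    Stores, per cluster, an occurrence count and a running value estimate,
    and derives a novelty-based intrinsic reward from the count.
    """
    def __init__(self, bin_width=0.5):
        self.bin_width = bin_width
        self.count = defaultdict(int)
        self.value = defaultdict(float)

    def _key(self, state, action):
        sa = np.concatenate([state, action])
        return tuple(np.floor(sa / self.bin_width).astype(int))

    def update(self, state, action, episodic_return):
        k = self._key(state, action)
        self.count[k] += 1
        # Running mean of episodic returns seen for this cluster.
        self.value[k] += (episodic_return - self.value[k]) / self.count[k]

    def estimate(self, state, action):
        return self.value[self._key(state, action)]

    def intrinsic_reward(self, state, action, beta=0.1):
        # Rarely visited clusters are "novel" and earn a larger bonus.
        return beta / np.sqrt(self.count[self._key(state, action)] + 1)

mem = ClusterEpisodicMemory()
s, a = np.array([0.1, -0.3]), np.array([0.7])
mem.update(s, a, episodic_return=12.0)
print(mem.estimate(s, a), mem.intrinsic_reward(s, a))
```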

Anxiety disorder identification with biomarker detection through subspace-enhanced hypergraph neural network
Yibin Tang, Jikang Ding, Ying Chen, Yuan Gao, Aimin Jiang, Chun Wang
Neural Networks, Pub Date: 2025-02-27, DOI: 10.1016/j.neunet.2025.107293 (Volume 187, Article 107293)

Abstract: In this study, we propose a subspace-enhanced hypergraph neural network (seHGNN) for classifying anxiety disorders (AD), which are prevalent mental illnesses that affect a significant portion of the global population. Our seHGNN model utilizes a learnable incidence matrix to strengthen the influence of hyperedges in graphs and enhance the feature extraction performance of hypergraph neural networks (HGNNs). We then integrate multimodal data on the brain limbic system into a hypergraph within an existing binary hypothesis testing framework. Experimental results demonstrate that our seHGNN achieves a remarkable accuracy of 84.46% for AD classification. By employing an ensemble learning strategy, we can further improve its performance, achieving a high accuracy of 94.1%. Our method outperforms other deep-learning-based methods, particularly GNN-based methods. Furthermore, our seHGNN successfully identifies discriminative AD biomarkers that align with existing reports, providing strong evidence supporting the effectiveness and interpretability of our proposed method.

Dataset-free weight-initialization on restricted Boltzmann machine
Muneki Yasuda, Ryosuke Maeno, Chako Takahashi
Neural Networks, Pub Date: 2025-02-26, DOI: 10.1016/j.neunet.2025.107297 (Volume 187, Article 107297)

Abstract: In feed-forward neural networks, dataset-free weight-initialization methods such as LeCun, Xavier (or Glorot), and He initializations have been developed. These methods randomly determine the initial values of weight parameters based on specific distributions (e.g., Gaussian or uniform distributions) without using training datasets. To the best of the authors' knowledge, such a dataset-free weight-initialization method is yet to be developed for restricted Boltzmann machines (RBMs), which are probabilistic neural networks consisting of two layers. In this study, we derive a dataset-free weight-initialization method for Bernoulli–Bernoulli RBMs based on statistical mechanical analysis. In the proposed weight-initialization method, the weight parameters are drawn from a Gaussian distribution with zero mean. The standard deviation of the Gaussian distribution is optimized based on our hypothesis that a standard deviation providing a larger layer correlation (LC) between the two layers improves the learning efficiency. The expression of the LC is derived based on a statistical mechanical analysis. The optimal value of the standard deviation corresponds to the maximum point of the LC. The proposed weight-initialization method is identical to Xavier initialization in a specific case (i.e., when the sizes of the two layers are the same, the random variables of the layers are {−1, 1}-binary, and all bias parameters are zero). The validity of the proposed weight-initialization method is demonstrated in numerical experiments using a toy dataset and real-world datasets.

Dual-view graph-of-graph representation learning with graph Transformer for graph-level anomaly detection
Wangyu Jin, Huifang Ma, Yingyue Zhang, Zhixin Li, Liang Chang
Neural Networks, Pub Date: 2025-02-25, DOI: 10.1016/j.neunet.2025.107291 (Volume 187, Article 107291)

Abstract: Graph-Level Anomaly Detection (GLAD) endeavors to pinpoint a small subset of anomalous graphs that deviate from the normal data distribution within a given set of graph data. Existing GLAD methods typically rely on Graph Neural Networks (GNNs) to extract graph-level representations, which are then used for the detection task. However, the inherently limited receptive field of GNNs may exclude crucial anomalous information embedded within the graph. Moreover, inadequate modeling of cross-graph relationships limits the exploration of connections between different graphs, restricting the model's ability to uncover inter-graph anomalous patterns. In this paper, we propose a novel approach called the Dual-View Graph-of-Graph Representation Learning Network for unsupervised GLAD, which takes into account both intra-graph and inter-graph perspectives. First, to enhance the mining of intra-graph information, we introduce a Graph Transformer that enlarges the receptive field of the GNNs by considering both attribute and structural information, enabling a comprehensive exploration of the information encoded within each graph. Second, we devise a graph-of-graph-based dual-view representation learning network to explicitly capture cross-graph interdependencies: attribute-based and structure-based graph-of-graph representations are induced, facilitating a comprehensive understanding of the relationships between graphs. Finally, we utilize anomaly scores from the different perspectives to quantify the extent of anomaly in each graph, providing a more comprehensive assessment of anomalies within the graph data. Extensive experiments on multiple benchmark datasets demonstrate the effectiveness of our proposed method in detecting anomalies within graph data.

Memory flow-controlled knowledge tracing with three stages
Tao Huang, Junjie Hu, Huali Yang, Shengze Hu, Jing Geng, Xinjia Ou
Neural Networks, Pub Date: 2025-02-25, DOI: 10.1016/j.neunet.2025.107292 (Volume 187, Article 107292)

Abstract: Knowledge Tracing (KT), a pivotal technology in intelligent education systems, analyzes students' learning data to infer their knowledge acquisition and predict their future performance. Recent advances in KT recognize the importance of memory laws in knowledge acquisition but neglect to model the inherent structure of memory, which leads to inconsistency between explicit student learning and implicit memory transformation. To enhance this consistency, we propose a novel memory flow-controlled knowledge tracing with three stages (MFCKT). Following information processing theory, we deconstruct learning into three stages: sensory registration, short-term encoding, and long-term memory retrieval. Specifically, to extract sensory memory, MFCKT maximizes the similarity between positive augmentation views of learning sequence representations through contrastive pre-training. Then, to transform sensory memory into short-term memory, MFCKT fuses relational and temporal properties of sensory memory through a dual-channel structure composed of attention and recurrent neural networks. Furthermore, to obtain long-term memory, MFCKT designs a monotonic gating mechanism to compute weights of hidden memory states and then performs read–write operations on the memory matrix. Finally, MFCKT combines the long-term and short-term memory vectors to retrieve latent knowledge states for future performance prediction. Extensive experimental results on five real-world datasets verify the superiority and interpretability of MFCKT.

Hyperspectral anomaly detection with self-supervised anomaly prior
Yidan Liu, Kai Jiang, Weiying Xie, Jiaqing Zhang, Yunsong Li, Leyuan Fang
Neural Networks, Pub Date: 2025-02-25, DOI: 10.1016/j.neunet.2025.107294 (Volume 187, Article 107294)

Abstract: Hyperspectral anomaly detection (HAD) can identify and locate targets without any known information and is widely applied in Earth observation and military fields. The majority of existing HAD methods use the low-rank representation (LRR) model to separate the background and anomaly through mathematical optimization, in which the anomaly is optimized with a handcrafted sparse prior (e.g., the ℓ2,1-norm). However, this may not be ideal, since such priors overlook the spatial structure present in anomalies and make the detection result largely dependent on a manually set sparsity. To tackle these problems, we redefine the optimization criterion for the anomaly in the LRR model with a self-supervised network called the self-supervised anomaly prior (SAP). This prior is obtained through a pretext task of self-supervised learning, customized to learn the characteristics of hyperspectral anomalies. Specifically, the pretext task is a classification task to distinguish the original hyperspectral image (HSI) from a pseudo-anomaly HSI, where the pseudo-anomaly is generated from the original HSI and designed as a prism with an arbitrary polygon base and arbitrary spectral bands. In addition, a dual-purified strategy is proposed to provide a more refined background representation with an enriched background dictionary, facilitating the separation of anomalies from complex backgrounds. Extensive experiments on various hyperspectral datasets demonstrate that the proposed SAP offers a more accurate and interpretable solution than other advanced HAD methods.

Exploiting instance-label dynamics through reciprocal anchored contrastive learning for few-shot relation extraction
Yanglei Gan, Qiao Liu, Run Lin, Tian Lan, Yuxiang Cai, Xueyi Liu, Changlin Li, Yan Liu
Neural Networks, Pub Date: 2025-02-25, DOI: 10.1016/j.neunet.2025.107259 (Volume 187, Article 107259)

Abstract: In the domain of Few-shot Relation Extraction (FSRE), the primary objective is to distill relational facts from limited labeled datasets. This task has recently witnessed significant advancements through the integration of Pre-trained Language Models (PLMs) within a supervised contrastive learning schema, which effectively leverages the dynamics between instance and label information. Despite these advancements, the comprehensive utilization of extensive instance-label pairs, aimed at facilitating the extraction of semantically rich representations within this paradigm, has yet to be fully harnessed. To bridge this gap, we introduce a Reciprocal Anchored Contrastive Learning framework (RACL) for few-shot relation extraction, which is predicated on the premise that instance-label pairs provide distinct yet inherently complementary insights into textual semantics. Specifically, RACL employs a symmetric contrastive objective that incorporates both instance-level and label-level contrastive losses, promoting a more integrated and unified representational space. This approach is engineered to effectively delineate the nuanced relationships between instance attributes and relational facts, while simultaneously optimizing information sharing across different perspectives within the same relations. Extensive experiments on the FSRE benchmark datasets demonstrate the superiority of our approach as compared to the state-of-the-art baselines. Further ablation studies on Zero-shot and None-of-the-above settings confirm its robustness and adaptability in practical applications.

ADFQ-ViT: Activation-Distribution-Friendly post-training Quantization for Vision Transformers
Yanfeng Jiang, Ning Sun, Xueshuo Xie, Fei Yang, Tao Li
Neural Networks, Pub Date: 2025-02-22, DOI: 10.1016/j.neunet.2025.107289 (Volume 186, Article 107289)

Abstract: Vision Transformers (ViTs) have exhibited exceptional performance across diverse computer vision tasks, but their substantial parameter size incurs significantly increased memory and computational demands, impeding effective inference on resource-constrained devices. Quantization has emerged as a promising solution to mitigate these challenges, yet existing methods still suffer from significant accuracy loss at low bit-widths. We attribute this issue to the distinctive distributions of post-LayerNorm and post-GELU activations within ViTs, which render conventional hardware-friendly quantizers ineffective, particularly in low-bit scenarios. To address this issue, we propose a novel framework called Activation-Distribution-Friendly post-training Quantization for Vision Transformers (ADFQ-ViT). Concretely, we introduce the Per-Patch Outlier-aware Quantizer to tackle irregular outliers in post-LayerNorm activations. This quantizer refines the granularity of the uniform quantizer to the per-patch level while retaining a minimal subset of values exceeding a threshold at full precision. To handle the non-uniform distribution of post-GELU activations between the positive and negative regions, we design the Shift-Log2 Quantizer, which shifts all elements to the positive region and then applies log2 quantization. Moreover, we present Attention-score-enhanced Module-wise Optimization, which adjusts the parameters of each quantizer by reconstructing errors to further mitigate quantization error. Extensive experiments demonstrate that ADFQ-ViT provides significant improvements over various baselines on image classification, object detection, and instance segmentation tasks at 4-bit precision. Specifically, when quantizing the ViT-B model to 4 bits, we achieve a 5.17% improvement in Top-1 accuracy on the ImageNet dataset. Our code is available at: https://github.com/llwx593/adfq-vit.git.

CNN-Transformer and Channel-Spatial Attention based network for hyperspectral image classification with few samples
Chuan Fu, Tianyuan Zhou, Tan Guo, Qikui Zhu, Fulin Luo, Bo Du
Neural Networks, Pub Date: 2025-02-22, DOI: 10.1016/j.neunet.2025.107283 (Volume 186, Article 107283)

Abstract: Hyperspectral image classification is an important foundational technology in the field of Earth observation and remote sensing. In recent years, deep learning has achieved a series of remarkable results in this area. These deep learning-based hyperspectral image classification methods typically require a large number of annotated samples to train the models. However, obtaining a large number of accurately annotated hyperspectral images for high-altitude or remote areas is usually extremely difficult. In this paper, we propose a novel algorithm, CTA-net, for hyperspectral classification with a small number of samples. First, we propose a sample expansion scheme to generate a large number of new samples and alleviate the problem of insufficient samples. On this basis, we introduce a novel hyperspectral classification network. The network first utilizes a CNN-Transformer module to extract features from hyperspectral image blocks, where the CNN focuses primarily on local features while the Transformer module focuses on non-local features. Furthermore, a simple channel-spatial attention module is adopted to further refine the features. We conducted experiments on multiple hyperspectral image datasets, which verified the effectiveness of our CTA-net.