Neural Networks: Latest Articles

LUNETR: Language-Infused UNETR for precise pancreatic tumor segmentation in 3D medical image
IF 6.0 | Q1 | Computer Science
Neural Networks Pub Date: 2025-03-15 DOI: 10.1016/j.neunet.2025.107414
Ziyang Shi, Ruopeng Zhang, Xiajun Wei, Cheng Yu, Haojie Xie, Zhen Hu, Xili Chen, Yongzhong Zhang, Bin Xie, Zhengmao Luo, Wanxiang Peng, Xiaochun Xie, Fang Li, Xiaoli Long, Lin Li, Linan Hu
Abstract: The identification of early micro-lesions and adjacent blood vessels in CT scans plays a pivotal role in the clinical diagnosis of pancreatic cancer, considering its aggressive nature and high fatality rate. Despite the widespread application of deep learning methods to this task, several challenges persist: (1) the complex background environment in abdominal CT scans complicates the accurate localization of potential micro-tumors; (2) the subtle contrast between micro-lesions within pancreatic tissue and the surrounding tissues makes it challenging for models to capture these features accurately; and (3) tumors that invade adjacent blood vessels pose significant barriers to surgical procedures. To address these challenges, we propose LUNETR (Language-Infused UNETR), an advanced multimodal encoder model that combines textual and image information for precise medical image segmentation. The integration of an autoencoding language model with cross-attention enables our model to effectively leverage semantic associations between textual and image data, thereby facilitating precise localization of potential pancreatic micro-tumors. Additionally, we designed a Multi-scale Aggregation Attention (MSAA) module to comprehensively capture both spatial and channel characteristics of global multi-scale image data, enhancing the model's capacity to extract features from micro-lesions embedded within pancreatic tissue. Furthermore, to facilitate precise segmentation of pancreatic tumors and nearby blood vessels and to address the scarcity of multimodal medical datasets, we collaborated with Zhuzhou Central Hospital to construct a multimodal dataset comprising CT images and corresponding pathology reports from 135 pancreatic cancer patients. Our model surpasses current state-of-the-art models, with the incorporation of the semantic encoder improving the average Dice score for pancreatic tumor segmentation by 2.23%. On the Medical Segmentation Decathlon (MSD) liver and lung cancer datasets, our model achieved average Dice score improvements of 4.31% and 3.67%, respectively, demonstrating the efficacy of LUNETR. (Neural Networks, vol. 187, Article 107414)
Citations: 0
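The text-guided localization described in this abstract hinges on cross-attention between pathology-report tokens and 3D image tokens. The sketch below illustrates that fusion pattern in PyTorch; the module name, dimensions, and residual-plus-norm combination are assumptions for illustration, not the authors' LUNETR code.

```python
import torch
import torch.nn as nn

class TextImageCrossAttention(nn.Module):
    """Hypothetical fusion block: image tokens attend to text tokens."""
    def __init__(self, dim=256, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, img_tokens, txt_tokens):
        # img_tokens: (B, N_img, dim) flattened 3D patch embeddings
        # txt_tokens: (B, N_txt, dim) encoded report embeddings
        fused, _ = self.attn(query=img_tokens, key=txt_tokens, value=txt_tokens)
        return self.norm(img_tokens + fused)  # residual fusion

# Toy usage with random tensors
img = torch.randn(2, 512, 256)   # e.g. flattened CT patch tokens
txt = torch.randn(2, 32, 256)    # e.g. pathology-report tokens
print(TextImageCrossAttention()(img, txt).shape)  # torch.Size([2, 512, 256])
```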
A novel deep transfer learning method based on explainable feature extraction and domain reconstruction
IF 6.0 | Q1 | Computer Science
Neural Networks Pub Date: 2025-03-15 DOI: 10.1016/j.neunet.2025.107401
Li Wang, Lucong Zhang, Ling Feng, Tianyu Chen, Hongwu Qin
Abstract: Although deep transfer learning has made significant progress, its “black-box” nature and unstable feature adaptation remain key obstacles. This study proposes a multi-stage deep transfer learning method, called XDTL, which combines explainable feature extraction and domain reconstruction to enhance the performance of target models. Specifically, the study first divides features into key and regular features through cross-validation and explainability analysis, then reconstructs the target domain using a seed replacement method based on key target samples, ultimately achieving deep transfer. Experimental results show that, compared to other methods, XDTL achieves an average improvement of 27.43% in effectiveness, demonstrating superior performance and stronger explainability. This method offers new insights into addressing the explainability challenges in transfer learning and highlights its potential for broader applications across various tasks. (Neural Networks, vol. 187, Article 107401)
Citations: 0
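The first XDTL stage splits features into "key" and "regular" groups via cross-validation and explainability analysis. A rough way to approximate such a split is permutation importance, sketched below with scikit-learn; the estimator, importance measure, and mean-importance threshold are assumed stand-ins rather than the paper's procedure.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Toy data standing in for the source-domain features
X, y = make_classification(n_samples=500, n_features=20, n_informative=5, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
imp = permutation_importance(clf, X_te, y_te, n_repeats=10, random_state=0)

# Assumed heuristic: features whose importance exceeds the mean are "key" features.
key_idx = np.where(imp.importances_mean > imp.importances_mean.mean())[0]
regular_idx = np.setdiff1d(np.arange(X.shape[1]), key_idx)
print("key features:", key_idx, "regular features:", regular_idx)
```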
Uncertainty-Aware Graph Contrastive Fusion Network for multimodal physiological signal emotion recognition
IF 6.0 | Q1 | Computer Science
Neural Networks Pub Date: 2025-03-14 DOI: 10.1016/j.neunet.2025.107363
Guangqiang Li, Ning Chen, Hongqing Zhu, Jing Li, Zhangyong Xu, Zhiying Zhu
Abstract: Graph Neural Networks (GNNs) have been widely adopted to mine topological patterns contained in physiological signals for emotion recognition. However, since physiological signals are non-stationary and susceptible to various noises, there exists inter-sensor connectivity uncertainty in each modality. Such intra-modal connectivity uncertainty may further lead to inter-modal semantic gap uncertainty, which causes a unimodal bias problem and greatly affects fusion effectiveness. However, this issue has not been fully considered in existing multimodal fusion models. To this end, we propose an Uncertainty-Aware Graph Contrastive Fusion Network (UAGCFNet) to fuse multimodal physiological signals effectively for emotion recognition. First, a probabilistic model-based Uncertainty-Aware Graph Convolutional Network (UAGCN), which can estimate and quantify the inter-sensor connectivity uncertainty, is constructed for each modality to extract its uncertainty-aware graph representation. Second, a Transitive Contrastive Fusion (TCF) module, which organically combines a Criss-Cross Attention (CCA)-based fusion mechanism and a Transitive Contrastive Learning (TCL)-based calibration strategy, is designed to achieve effective fusion of multimodal graph representations by eliminating the unimodal bias problem resulting from the inter-modal semantic gap uncertainty. Extensive experimental results on the DEAP, DREAMER, and MPED datasets under both subject-dependent and subject-independent scenarios demonstrate that (i) the proposed model outperforms state-of-the-art (SOTA) multimodal fusion models with fewer parameters and lower computational complexity; (ii) each key module and loss function contributes significantly to the performance enhancement of the proposed model; and (iii) the proposed model can eliminate the unimodal bias problem effectively. (Neural Networks, vol. 187, Article 107363)
Citations: 0
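One way to read "probabilistic model-based" connectivity uncertainty is to treat each inter-sensor connection as a learned distribution and sample the adjacency during message passing. The minimal PyTorch sketch below follows that interpretation; the Gaussian parameterization, sigmoid squashing, and row normalization are assumptions, not the published UAGCN.

```python
import torch
import torch.nn as nn

class UncertainGraphConv(nn.Module):
    """Graph convolution over a sampled, uncertainty-aware soft adjacency (illustrative)."""
    def __init__(self, n_nodes, in_dim, out_dim):
        super().__init__()
        self.mu = nn.Parameter(torch.zeros(n_nodes, n_nodes))        # connection mean
        self.log_sigma = nn.Parameter(torch.zeros(n_nodes, n_nodes))  # connection log-std
        self.lin = nn.Linear(in_dim, out_dim)

    def forward(self, x):
        # x: (B, n_nodes, in_dim) per-sensor features
        eps = torch.randn_like(self.mu)
        adj = torch.sigmoid(self.mu + eps * self.log_sigma.exp())  # sampled soft adjacency
        adj = adj / adj.sum(dim=-1, keepdim=True)                   # row-normalize
        return torch.relu(self.lin(adj @ x))                         # propagate and transform

x = torch.randn(4, 32, 16)            # 4 samples, 32 channels, 16 features each
layer = UncertainGraphConv(32, 16, 8)
print(layer(x).shape)                  # torch.Size([4, 32, 8])
```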
Dynamic semantic-geometric guidance and structure transfer network for cross-scene hyperspectral image classification
IF 6.0 | Q1 | Computer Science
Neural Networks Pub Date: 2025-03-14 DOI: 10.1016/j.neunet.2025.107374
Qin Xu, Shuke Wang, Jie Wei, Bo Jiang, Zhifu Tao, Bin Luo
Abstract: Recently, cross-scene hyperspectral image classification (HSIC) via domain adaptation has drawn increasing attention. However, most existing methods either directly align the source and target domains without fully mining source-domain information, or perform domain adaptation from semantic and structural aspects with overly simple characterizations that are sensitive to noise, resulting in negative transfer and performance decline. To address these issues, we propose a novel Dynamic Semantic-Geometric Guidance and Structure Transfer (DSGG-ST) network for the cross-scene hyperspectral image classification task. The main aspects of DSGG-ST are twofold. On the one hand, the dynamic semantic-geometric guidance (DSGG) module is designed, consisting of a semantic guidance component and a geometric guidance component. The proposed DSGG module can align the source and target domains under the dynamic guidance of domain-invariance learning from semantic and geometric perspectives. On the other hand, the graph attention learning-matching (GALM) module is developed to effectively transfer structure information between the source domain and the target domain. In this module, a graph attention network is adopted to encode the underlying complex structures, and SeedGNN is exploited for efficient graph matching and alignment. Extensive experiments on three commonly used cross-scene HSI datasets demonstrate that the proposed DSGG-ST achieves new state-of-the-art performance on cross-scene HSIC, verifying its effectiveness. (Neural Networks, vol. 187, Article 107374)
Citations: 0
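Domain alignment of the kind described above is typically driven by some discrepancy measure between source and target features. The snippet below shows a generic RBF-kernel MMD as one common such measure; it is purely illustrative, and the paper's semantic and geometric guidance terms may differ entirely.

```python
import torch

def rbf_mmd(xs, xt, sigma=1.0):
    """Simple (biased) RBF-kernel MMD between source and target feature batches."""
    def k(a, b):
        d2 = torch.cdist(a, b) ** 2
        return torch.exp(-d2 / (2 * sigma ** 2))
    return k(xs, xs).mean() + k(xt, xt).mean() - 2 * k(xs, xt).mean()

src = torch.randn(64, 128)         # source-domain features
tgt = torch.randn(48, 128) + 0.5   # shifted target-domain features
print(rbf_mmd(src, tgt).item())
```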
A spatial–spectral fusion convolutional transformer network with contextual multi-head self-attention for hyperspectral image classification
IF 6.0 | Q1 | Computer Science
Neural Networks Pub Date: 2025-03-14 DOI: 10.1016/j.neunet.2025.107350
Wuli Wang, Qi Sun, Li Zhang, Peng Ren, Jianbu Wang, Guangbo Ren, Baodi Liu
Abstract: Convolutional neural networks (CNNs) can effectively extract local features, while Vision Transformers excel at capturing global features. Combining these two networks to enhance the classification performance of hyperspectral images (HSI) has garnered significant attention. However, most existing fusion methods introduce inductive biases for the Transformer by directly connecting convolutional modules and Transformer encoders for feature extraction, but rarely enhance the Transformer's ability to extract local contextual information through convolutional embedding. In this paper, we propose a spatial–spectral fusion convolutional Transformer method (SSFCT) with contextual multi-head self-attention (CMHSA) for HSI classification. Specifically, we first design a local feature aggregation (LFA) module that utilizes a three-branch convolution architecture and attention layers to extract and enhance local spatial–spectral fusion features. Then, a novel CMHSA is built to extract interaction information of local contextual features by integrating static and dynamic local contextual representations from 3D convolution and attention mechanisms, and the CMHSA is integrated into the devised dual-branch spatial–spectral convolutional transformer (DSSCT) module to simultaneously capture global–local associations in both the spatial and spectral domains. Finally, an attention feature fusion (AFF) module is proposed to fully obtain global–local spatial–spectral comprehensive features. Extensive experiments on five HSI datasets (Indian Pines, Salinas Valley, Houston2013, Botswana, and Yellow River Delta) show that SSFCT outperforms state-of-the-art methods, achieving overall accuracies of 98.03%, 99.68%, 98.65%, 97.97%, and 89.43%, respectively, showcasing its effectiveness for HSI classification. (Neural Networks, vol. 187, Article 107350)
Citations: 0
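The CMHSA idea of combining a static local-context path with a dynamic attention path can be illustrated by feeding a depthwise 3D convolution's output into standard multi-head self-attention. The layer names, kernel size, and residual combination in the sketch below are assumptions, not the paper's exact CMHSA definition.

```python
import torch
import torch.nn as nn

class ConvContextAttention(nn.Module):
    """Illustrative mix of static conv context and dynamic self-attention over a 3D cube."""
    def __init__(self, dim=64, num_heads=4):
        super().__init__()
        self.local = nn.Conv3d(dim, dim, kernel_size=3, padding=1, groups=dim)  # static context
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)     # dynamic context
        self.norm = nn.LayerNorm(dim)

    def forward(self, x):
        # x: (B, dim, D, H, W) spatial-spectral feature cube
        b, c, d, h, w = x.shape
        static = self.local(x)
        tokens = static.flatten(2).transpose(1, 2)   # (B, D*H*W, dim)
        dynamic, _ = self.attn(tokens, tokens, tokens)
        out = self.norm(tokens + dynamic)             # fuse static and dynamic context
        return out.transpose(1, 2).reshape(b, c, d, h, w)

x = torch.randn(2, 64, 8, 9, 9)   # e.g. a small spectral-spatial patch
print(ConvContextAttention()(x).shape)  # torch.Size([2, 64, 8, 9, 9])
```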
A neural approach to the Turing Test: The role of emotions
IF 6.0 | Q1 | Computer Science
Neural Networks Pub Date: 2025-03-14 DOI: 10.1016/j.neunet.2025.107362
Rita Pizzi, Hao Quan, Matteo Matteucci, Simone Mentasti, Roberto Sassi
Abstract: As is well known, the Turing Test proposes the possibility of distinguishing the behavior of a machine from that of a human being through an experimental session. The Turing Test assesses whether a person asking questions of two different entities can tell from their answers which of them is the human being and which is the machine. With the progress of Artificial Intelligence, the number of contexts in which a machine's responses will be indistinguishable from those of a human being is expected to increase rapidly. To configure a Turing Test in which it is possible to distinguish human behavior from machine behavior independently of advances in Artificial Intelligence, at least in the short-to-medium term, it would be important to base it not on differences between man and machine in terms of performance and dialogue capacity, but on some specific characteristic of the human mind that cannot be reproduced by the machine even in principle. We studied a new kind of test based on the hypothesis that such a characteristic of the human mind exists and can be made experimentally evident. This peculiar characteristic is the emotional content of human cognition and, more specifically, its link with memory enhancement. To validate this hypothesis, we recorded the EEG signals of 39 subjects who underwent a specific test and analyzed their signals with a neural network able to label similar signal patterns with similar binary codes. The results showed that, with a statistically significant difference, the test participants more easily recognized images associated in the past with an emotional reaction than those not associated with such a reaction. This distinction, in our view, is not accessible to a software system, even an AI-based one, and a Turing Test based on this feature of the mind may make human and machine responses distinguishable. (Neural Networks, vol. 187, Article 107362)
Citations: 0
An improved Artificial Protozoa Optimizer for CNN architecture optimization
IF 6.0 | Q1 | Computer Science
Neural Networks Pub Date: 2025-03-13 DOI: 10.1016/j.neunet.2025.107368
Xiaofeng Xie, Yuelin Gao, Yuming Zhang
Abstract: In this paper, we propose a novel neural architecture search (NAS) method called MAPOCNN, which leverages an enhanced version of the Artificial Protozoa Optimizer (APO) to optimize the architecture of Convolutional Neural Networks (CNNs). The APO is known for its rapid convergence, high stability, and minimal parameter involvement. To further improve its performance, we introduce MAPO (Modified Artificial Protozoa Optimizer), which incorporates the phototaxis behavior of protozoa. This addition helps mitigate the risk of premature convergence, allowing the algorithm to explore a broader range of possible CNN architectures and ultimately identify more optimal solutions. Through rigorous experimentation on benchmark datasets, including Rectangle and Mnist-random, we demonstrate that MAPOCNN not only achieves faster convergence times but also performs competitively when compared to other state-of-the-art NAS algorithms. The results highlight the effectiveness of MAPOCNN in efficiently discovering CNN architectures that outperform existing methods in terms of both speed and accuracy. This work presents a promising direction for optimizing deep learning architectures using biologically inspired optimization techniques. (Neural Networks, vol. 187, Article 107368)
Citations: 0
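Population-based architecture search of this kind encodes candidate CNN configurations, scores them with a fitness function (usually validation accuracy after training), and iteratively mutates the fittest candidates. The toy loop below shows only that generic skeleton with a dummy fitness proxy; it does not implement APO or the proposed MAPO operators.

```python
import random

def random_arch():
    # Hypothetical architecture encoding: depth, filter count, kernel size.
    return {"n_conv": random.choice([2, 3, 4]),
            "filters": random.choice([16, 32, 64]),
            "kernel": random.choice([3, 5])}

def fitness(arch):
    # Placeholder proxy; in practice this would be validation accuracy of the trained CNN.
    return -abs(arch["n_conv"] - 3) - abs(arch["filters"] - 32) / 32

population = [random_arch() for _ in range(10)]
for gen in range(20):
    parents = sorted(population, key=fitness, reverse=True)[:5]
    children = []
    for p in parents:
        child = dict(p)
        key = random.choice(list(child))   # mutate one randomly chosen field
        child[key] = random_arch()[key]
        children.append(child)
    population = parents + children

print("best architecture:", max(population, key=fitness))
```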
Semantic information-based attention mapping network for few-shot knowledge graph completion
IF 6.0 | Q1 | Computer Science
Neural Networks Pub Date: 2025-03-13 DOI: 10.1016/j.neunet.2025.107366
Fan Guo, Xiangmao Chang, Yunqi Guo, Guoliang Xing, Yunlong Zhao
Abstract: Few-shot Knowledge Graph Completion (FKGC), an emerging technology capable of inferring new triples using only a few reference relation triples, has gained significant attention in recent years. However, existing FKGC methods primarily focus on structural information while failing to effectively utilize the textual semantic information inherent in triples. To address this limitation, we propose an innovative Semantic Information-based Attention Mapping Network (SI-AMN). This novel model significantly enhances knowledge graph completion accuracy through a unique dual-information fusion mechanism that effectively integrates both structural and textual semantic information. The core innovation of SI-AMN lies in its two key components: a semantic encoder for extracting high-quality textual features and an attention mapping network that learns semantic interactions between entity and relation types. Experimental results on benchmark datasets demonstrate SI-AMN's superior performance, achieving a 40% improvement in prediction accuracy compared to state-of-the-art methods. Ablation studies further validate the effectiveness of each component in our proposed model. This research not only provides a novel solution for knowledge graph completion but also reveals the crucial value of semantic information in graph completion tasks, paving the way for future research directions in this field. (Neural Networks, vol. 187, Article 107366)
Citations: 0
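An attention mapping between relation types and entity types can be pictured as relation-type embeddings attending over candidate entity-type embeddings. The sketch below is a loose, hypothetical illustration of that idea; the embedding sizes and scoring rule are not taken from SI-AMN.

```python
import torch
import torch.nn as nn

class TypeAttentionScorer(nn.Module):
    """Illustrative attention of a relation type over candidate entity types."""
    def __init__(self, n_entity_types, n_relation_types, dim=64):
        super().__init__()
        self.ent_type = nn.Embedding(n_entity_types, dim)
        self.rel_type = nn.Embedding(n_relation_types, dim)

    def forward(self, rel_ids, ent_ids):
        q = self.rel_type(rel_ids)   # (B, dim) relation-type queries
        k = self.ent_type(ent_ids)   # (B, K, dim) candidate entity-type keys
        scores = (k @ q.unsqueeze(-1)).squeeze(-1) / k.size(-1) ** 0.5
        return torch.softmax(scores, dim=-1)   # attention over candidate types

scorer = TypeAttentionScorer(n_entity_types=50, n_relation_types=20)
att = scorer(torch.tensor([3, 7]), torch.randint(0, 50, (2, 5)))
print(att.shape)   # torch.Size([2, 5])
```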
Semi-supervised non-negative matrix factorization with structure preserving for image clustering
IF 6.0 | Q1 | Computer Science
Neural Networks Pub Date: 2025-03-13 DOI: 10.1016/j.neunet.2025.107340
Wenjing Jing, Linzhang Lu, Weihua Ou
Abstract: Semi-supervised learning methods have wide applications thanks to their effective use of the partial label information of data. In recent years, non-negative matrix factorization (NMF) has received considerable attention because of its interpretability and practicality. Building on the advantages of semi-supervised learning and NMF, many semi-supervised NMF methods have been presented. However, these existing semi-supervised NMF methods construct a label matrix containing only the elements 1 and 0 to represent the labeled data and further construct a label regularization, which neglects the intrinsic structure of NMF. To address this deficiency, in this paper we propose a novel semi-supervised NMF method with structure preserving. Specifically, we first construct a new label matrix with weights and further construct a label constraint regularizer to both utilize the label information and maintain the intrinsic structure of NMF. Then, based on the label constraint regularizer, the basis images of the labeled data are extracted to monitor and modify the basis-image learning of all data through a basis regularizer. Finally, incorporating the label constraint regularizer and the basis regularizer into NMF, we propose a new semi-supervised NMF method. To solve the optimization problem, a multiplicative updating algorithm is developed. The proposed method is applied to image clustering to test its performance. Experimental results on eight data sets demonstrate the effectiveness of the proposed method in contrast with state-of-the-art unsupervised and semi-supervised algorithms. (Neural Networks, vol. 187, Article 107340)
Citations: 0
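For reference, the classical multiplicative-update NMF that this method builds on fits W and H by alternating non-negative updates. The NumPy baseline below shows only that starting point; the weighted label matrix, label constraint regularizer, and basis regularizer described in the abstract are not included.

```python
import numpy as np

def nmf(X, r, n_iter=200, eps=1e-9, seed=0):
    """Baseline NMF via Lee-Seung multiplicative updates (Frobenius objective)."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, r))   # basis images
    H = rng.random((r, n))   # coefficients
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

X = np.random.default_rng(1).random((100, 60))   # toy non-negative data
W, H = nmf(X, r=10)
print("reconstruction error:", np.linalg.norm(X - W @ H))
```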
Adaptive decoupling-fusion in Siamese network for image classification
IF 6.0 | Q1 | Computer Science
Neural Networks Pub Date: 2025-03-13 DOI: 10.1016/j.neunet.2025.107346
Xi Yang, Pai Peng, Danyang Li, Yinghao Ye, Xiaohuan Lu
Abstract: Convolutional neural networks (CNNs) are highly regarded for their ability to extract semantic information from visual inputs. However, this capability often leads to the inadvertent loss of important visual details. In this paper, we introduce an Adaptive Decoupling Fusion (ADF) designed to preserve these valuable visual details and integrate seamlessly with existing hierarchical models. Our approach emphasizes retaining and leveraging appearance information from the network's shallow layers to enhance semantic understanding. We first decouple the appearance information from one branch of a Siamese Network and embed it into the deep feature space of the other branch. This facilitates a synergistic interaction: one branch supplies appearance information that benefits semantic understanding, while the other integrates this information into the semantic space. Traditional Siamese Networks typically use shared weights, which constrains the diversity of features that can be learned. To address this, we propose a differentiated collaborative learning scheme in which both branches receive the same input but are trained with cross-entropy loss, allowing them to have distinct weights. This enhances the network's adaptability to specific tasks. To further optimize the decoupling and fusion, we introduce a Mapper module featuring depthwise separable convolution and a gated fusion mechanism. This module regulates the information flow between branches, balancing appearance and semantic information. Under fully self-supervised conditions, utilizing only minimal data augmentation, we achieve a top-1 accuracy of 81.11% on the ImageNet-1k dataset using ADF-ResNeXt-101. (Neural Networks, vol. 187, Article 107346)
Citations: 0
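The Mapper module's gated fusion of shallow appearance features with deep semantic features can be sketched as a depthwise separable convolution followed by a learned per-location gate. The PyTorch snippet below is an assumed illustration of that pattern, not the ADF implementation.

```python
import torch
import torch.nn as nn

class GatedFusion(nn.Module):
    """Illustrative gated fusion of appearance and semantic feature maps."""
    def __init__(self, channels):
        super().__init__()
        # Depthwise separable convolution: depthwise 3x3 followed by pointwise 1x1.
        self.dw = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)
        self.pw = nn.Conv2d(channels, channels, 1)
        self.gate = nn.Sequential(nn.Conv2d(2 * channels, channels, 1), nn.Sigmoid())

    def forward(self, appearance, semantic):
        a = self.pw(self.dw(appearance))                 # mapped appearance features
        g = self.gate(torch.cat([a, semantic], dim=1))   # per-location fusion gate
        return g * a + (1 - g) * semantic

appearance = torch.randn(2, 128, 14, 14)   # shallow-branch features (projected to match)
semantic = torch.randn(2, 128, 14, 14)     # deep-branch features
print(GatedFusion(128)(appearance, semantic).shape)  # torch.Size([2, 128, 14, 14])
```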