Neural Networks, Vol. 190, Article 107680. Pub Date: 2025-05-29. DOI: 10.1016/j.neunet.2025.107680
Yuanhong Tang, Shanshan Jia, Tiejun Huang, Zhaofei Yu, Jian K. Liu
Title: Corrigendum to "Implementing feature binding through dendritic networks of a single neuron" [Neural Networks (2025) 107555]
Neural Networks, Vol. 190, Article 107620. Pub Date: 2025-05-28. DOI: 10.1016/j.neunet.2025.107620
Dawei Song, Yuan Yuan, Xuelong Li
Title: Potential region attention network for RGB-D salient object detection
Abstract: Many encouraging investigations have already been conducted on RGB-D salient object detection (SOD). However, most existing methods are limited in mining single-modal features and do not fully exploit the complementarity of cross-modal features. To alleviate these issues, this study designs a potential region attention network (PRANet) for RGB-D SOD. Specifically, PRANet adopts the Swin Transformer as its backbone to efficiently obtain two-stream features. A potential multi-scale attention module (PMAM) is placed at the highest level of the encoder, which helps mine intra-modal information and enhance feature expression. More importantly, a potential region attention module (PRAM) is designed to properly exploit the complementarity of cross-modal information, using a potential region attention map to guide two-stream feature fusion. In addition, a feature refinement fusion module (FRFM) refines and corrects cross-layer features to strengthen information transmission between the encoder and decoder. Finally, multi-side supervision is used during the training phase. Extensive experiments on six RGB-D SOD datasets indicate that PRANet achieves outstanding performance and is superior to 15 representative methods.
Neural Networks, Vol. 190, Article 107610. Pub Date: 2025-05-27. DOI: 10.1016/j.neunet.2025.107610
Keisuke Kawano, Takuro Kutsuna, Keisuke Sano
Title: Minimal sufficient views: A DNN model making predictions with more evidence has higher accuracy
Abstract: Deep neural networks (DNNs) exhibit high performance in image recognition; however, the reasons for their strong generalization abilities remain unclear. A plausible hypothesis is that DNNs achieve robust and accurate predictions by identifying multiple pieces of evidence in an image. To test this hypothesis, this study proposes minimal sufficient views (MSVs): a set of minimal regions within an input image, each sufficient to preserve the DNN's prediction, thus representing the evidence discovered by the DNN. We empirically demonstrate a strong correlation between the number of MSVs (i.e., the number of pieces of evidence) and the generalization performance of DNN models. Remarkably, this correlation holds within a single DNN as well as across different DNNs, including convolutional and transformer models, suggesting that a model basing its prediction on more evidence generalizes better. We further propose an MSV-based metric for DNN model selection that requires no label information. Empirically, the proposed metric is less dependent on the degree of overfitting, rendering it a more reliable indicator of model performance than existing metrics such as average confidence.
Neural Networks, Vol. 190, Article 107613. Pub Date: 2025-05-27. DOI: 10.1016/j.neunet.2025.107613
Huaping Zhou, Tao Wu, Kelei Sun, Jin Wu, Bin Deng, Xueseng Zhang
Title: HLGNet: High-Light Guided Network for low-light instance segmentation with spatial-frequency domain enhancement
Abstract: Instance segmentation models generally perform well under typical lighting conditions but struggle in low-light environments due to insufficient fine-grained detail. Frequency-domain enhancement has shown promise in this setting; however, the lack of spatial-domain processing in existing frequency-domain methods often results in poor boundary delineation and inadequate local perception. To address these challenges, we propose HLGNet (High-Light Guided Network). By leveraging high-light image masks, our approach integrates enhancements in both the frequency and spatial domains, thereby improving the feature representation of low-light images. Specifically, we propose the SPE (Spatial-Frequency Enhancement) Block, which effectively combines and complements local spatial features with global frequency-domain information. Additionally, we design the DAF (Dynamic Affine Fusion) module to inject frequency-domain information into semantically significant features, enhancing the model's ability to capture both fine target detail and global semantic context. Finally, we propose the HLG Decoder, which dynamically adjusts the attention distribution using mutual information and entropy, guided by high-light image masks, ensuring improved focus on both local details and global semantics. Extensive quantitative and qualitative evaluations on two widely used low-light instance segmentation datasets demonstrate that HLGNet outperforms current state-of-the-art methods.
Neural Networks, Vol. 190, Article 107606. Pub Date: 2025-05-27. DOI: 10.1016/j.neunet.2025.107606
Qiyue Li, Xuemei Xie, Jin Zhang, Guangming Shi
Title: Recognizing human–object interactions in videos with the supervision of natural language
Abstract: Existing models for recognizing human–object interaction (HOI) in videos rely mainly on visual information and treat recognition as a traditional multi-class classification problem in which labels are represented by numbers. This supervised learning scheme discards the semantic information carried by the labels and ignores the higher-level semantic relationships between categories. Natural language, in contrast, contains a wealth of linguistic knowledge that humans have distilled about human–object interaction, and category text encodes rich semantic relationships. This paper therefore introduces HOI category text features as labels and proposes a natural-language-supervised learning model that uses language to supervise visual feature learning and enhance visual feature expressiveness. The model applies the contrastive learning paradigm to HOI recognition: an image–text pre-trained model provides individual image features and interaction-category text features, and a spatial–temporal mixing module produces high-level, combination-based spatial–temporal HOI features. The resulting visual interaction features are then compared with the category text features by similarity to infer the correct video HOI category. The aim is to exploit the semantic information in HOI category label text and to use the visual–textual correspondence learned by a multi-modal pre-training model from large numbers of image–text pairs to strengthen video HOI recognition. Experimental results on two HOI datasets demonstrate state-of-the-art performance, e.g., 93.6% and 93.1% F1 score for sub-activity and affordance recognition on the CAD-120 dataset.
Neural Networks, Vol. 190, Article 107618. Pub Date: 2025-05-27. DOI: 10.1016/j.neunet.2025.107618
Peng He, Jun Yu, Liuxue Ju, Fang Gao
Title: Fine-grained hierarchical dynamics for image harmonization
Abstract: Image harmonization aims to generate visually consistent composite images by ensuring compatibility between the foreground and background. Existing strategies based on global transformations emphasize using background information for foreground normalization, potentially overlooking significant appearance variations among regions within a scene; at the same time, the coherence of local information is critical to generating visually consistent images. To address these issues, we propose the Hierarchical Dynamics Appearance Translation (HDAT) framework, which enables a seamless transition of features and parameters from local to global views and adaptively adjusts the foreground appearance based on the corresponding background. Specifically, we introduce a dynamic region-aware convolution and a fine-grained mixed attention mechanism to coordinate global and local details. The dynamic region-aware convolution, guided by foreground masks, learns adaptive representations and correlations of foreground and background elements based on global dynamics, while the fine-grained mixed attention dynamically adjusts features across channels and positions to achieve local adaptation. Furthermore, we integrate a multi-scale feature calibration strategy to ensure information consistency across scales. Extensive experiments demonstrate that HDAT significantly reduces the number of network parameters while outperforming existing methods both qualitatively and quantitatively.
Neural Networks, Vol. 190, Article 107612. Pub Date: 2025-05-27. DOI: 10.1016/j.neunet.2025.107612
Ming Gu, Gaoming Yang, Zhuonan Zheng, Meihan Liu, Haishuai Wang, Jiawei Chen, Sheng Zhou, Jiajun Bu
Title: Frequency Self-Adaptation Graph Neural Network for Unsupervised Graph Anomaly Detection
Abstract: Unsupervised Graph Anomaly Detection (UGAD) seeks to identify abnormal patterns in graphs without relying on labeled data. Among existing UGAD methods, Graph Neural Networks (GNNs) have played a critical role in learning effective representations by filtering low-frequency graph signals. However, the presence of anomalies can shift the frequency band of graph signals toward higher frequencies, violating the fundamental assumptions underlying GNNs and anomaly detection frameworks. The design of novel graph filters has therefore attracted significant attention, with recent approaches leveraging anomaly labels in a semi-supervised manner. Nonetheless, anomaly labels are rarely available in real-world scenarios, rendering these methods impractical and leaving the question of how to design effective filters in an unsupervised manner largely unexplored. To bridge this gap, we propose a Frequency Self-Adaptation Graph Neural Network for Unsupervised Graph Anomaly Detection (FAGAD). Specifically, FAGAD adaptively fuses signals across multiple frequency bands, using full-pass signals as a reference, and is optimized via self-supervised learning, enabling effective representations for unsupervised graph anomaly detection. Experimental results show that FAGAD achieves state-of-the-art performance on both artificially injected and real-world datasets. The code and datasets are publicly available at https://github.com/eaglelab-zju/FAGAD.
Neural Networks, Vol. 190, Article 107608. Pub Date: 2025-05-27. DOI: 10.1016/j.neunet.2025.107608
Dunwei Tu, Huiyu Yi, Tieyi Zhang, Ruotong Li, Furao Shen, Jian Zhao
Title: Embedding Space Allocation with Angle-Norm Joint Classifiers for few-shot class-incremental learning
Abstract: Few-shot class-incremental learning (FSCIL) aims to continually learn new classes from only a few samples without forgetting previous ones, requiring intelligent agents to adapt to dynamic environments. FSCIL combines the characteristics and challenges of class-incremental learning and few-shot learning: (i) the current classes occupy the entire feature space, which is detrimental to learning new classes, and (ii) the small number of samples in incremental sessions is insufficient for full training. Existing mainstream virtual-class methods address challenge (i) by using virtual classes as placeholders, but new classes may not align with the virtual classes. For challenge (ii), they replace trainable fully connected layers with Nearest Class Mean (NCM) classifiers based on cosine similarity, yet NCM classifiers do not account for sample imbalance. To address these issues, we propose the class-center-guided embedding Space Allocation with Angle-Norm joint classifiers (SAAN) learning framework, which provides balanced space for all classes and leverages the norm differences caused by sample imbalance to strengthen the classification criterion. Specifically, for challenge (i), SAAN divides the feature space into multiple subspaces and allocates a dedicated subspace to each session by guiding samples with pre-set category centers. For challenge (ii), SAAN establishes a norm distribution for each class and generates angle-norm joint logits. Experiments demonstrate that SAAN achieves state-of-the-art performance and can be directly embedded into other SOTA methods as a plug-in, further enhancing their performance.
Neural Networks, Vol. 190, Article 107627. Pub Date: 2025-05-27. DOI: 10.1016/j.neunet.2025.107627
Xuechun Hu, Yu Xia, Zsófia Lendek, Jinde Cao, Radu-Emil Precup
Title: A novel dynamic prescribed performance fuzzy-neural backstepping control for PMSM under step load
Abstract: To meet the performance requirements of permanent magnet synchronous motor (PMSM) systems with time-varying model parameters and input constraints under step load, this paper proposes a dynamic prescribed performance fuzzy-neural backstepping control approach. First, a novel finite-time asymmetric dynamic prescribed performance function (FADPPF) is proposed to tackle the issues of exceeded predefined error bounds, control singularity, and system instability that arise with the traditional prescribed performance function under load changes. To address the model-accuracy degradation and control-quality deterioration caused by nonlinear time-varying parameters and input constraints in the PMSM system, a backstepping controller is designed by combining a speed function (SF), a fuzzy neural network (FNN), and the proposed FADPPF. The FNN approximates the nonlinear uncertain functions in the system model, while the SF, acting as an error amplification mechanism, works together with the FADPPF to ensure the transient and steady-state performance of the system. The stability of the devised control strategy is proved via Lyapunov analysis. Finally, simulation results demonstrate the dynamic self-adjusting ability and effectiveness of the FADPPF under step load, and validate the feasibility and superiority of the proposed control scheme.
{"title":"DiverseReID: Towards generalizable person re-identification via Dynamic Style Hallucination and decoupled domain experts","authors":"Jieru Jia, Huidi Xie, Qin Huang, Yantao Song, Peng Wu","doi":"10.1016/j.neunet.2025.107602","DOIUrl":"10.1016/j.neunet.2025.107602","url":null,"abstract":"<div><div>Person re-identification (re-ID) models often fail to generalize well when deployed to other camera networks with domain shift. A classical domain generalization (DG) solution is to enhance the diversity of source data so that a model can learn more domain-invariant, and hence generalizable representations. Existing methods typically mix images from different domains in a mini-batch to generate novel styles, but the mixing coefficient sampled from predefined Beta distribution requires careful manual tuning and may render sub-optimal performance. To this end, we propose a plug-and-play Dynamic Style Hallucination (DSH) module that adaptively adjusts the mixing weights based on the style distribution discrepancy between image pairs, which is dynamically measured with the reciprocal of Wasserstein distances. This approach not only reduces the tedious manual tuning of parameters but also significantly enriches style diversity by expanding the perturbation space to the utmost. In addition, to promote inter-domain diversity, we devise a Domain Experts Decoupling (DED) loss, which constrains features from one domain to go towards the orthogonal direction against features from other domains. The proposed approach, dubbed DiverseReID, is parameter-free and computationally efficient. Without bells and whistles, it outperforms the state-of-the-art on various DG re-ID benchmarks. Experiments verify that style diversity, not just the size of the training data, is crucial for enhancing generalization.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"189 ","pages":"Article 107602"},"PeriodicalIF":6.0,"publicationDate":"2025-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144135130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}