Neural Networks | Pub Date: 2025-05-26 | DOI: 10.1016/j.neunet.2025.107604
Shouwen Wang, Qian Wan, Zihan Zhang, Zhigang Zeng
{"title":"Prompt-guided consistency learning for multi-label classification with incomplete labels","authors":"Shouwen Wang , Qian Wan , Zihan Zhang , Zhigang Zeng","doi":"10.1016/j.neunet.2025.107604","DOIUrl":"10.1016/j.neunet.2025.107604","url":null,"abstract":"<div><div>Addressing insufficient supervision and improving model generalization are essential for multi-label classification with incomplete annotations, <em>i.e.</em>, partial and single positive labels. Recent studies incorporate pseudo-labels to provide additional supervision and enhance model generalization. However, the noise in pseudo-labels generated by the model tends to accumulate, resulting in confirmation bias during training. Self-correction methods, commonly used approaches for mitigating confirmation bias, rely on model predictions but remain susceptible to confirmation bias caused by visual confusion, including both visual ambiguity and similarity. To reduce visual confusion, we propose a prompt-guided consistency learning (PGCL) framework designed for two incomplete labeling settings. Specifically, we introduce an intra-category supervised contrastive loss, which imposes consistency constraints on reliable positive class samples in the feature space of each category, rather than across the feature space of all categories, as in traditional inter-category supervised contrastive loss. Building on this, the distinction between true positive and visual confusion samples for each category is enhanced through label-level contrasting of the same category. Additionally, we develop a class-specific semantic decoupling module that leverages CLIP’s strong vision-language alignment capability, since the proposed contrastive loss requires high-quality label-level representations as contrastive samples. Extensive experimental results on multiple datasets demonstrate that our method can effectively address the problems of two incomplete labeling settings and achieve state-of-the-art performance.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"190 ","pages":"Article 107604"},"PeriodicalIF":6.0,"publicationDate":"2025-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144211831","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neural Networks | Pub Date: 2025-05-26 | DOI: 10.1016/j.neunet.2025.107681
Danfeng Zhao, Yanhao Chen, Wei Song, Qi He
{"title":"Cross-view self-supervised heterogeneous graph representation learning","authors":"Danfeng Zhao, Yanhao Chen, Wei Song, Qi He","doi":"10.1016/j.neunet.2025.107681","DOIUrl":"10.1016/j.neunet.2025.107681","url":null,"abstract":"<div><div>Heterogeneous graph neural networks (HGNNs) often face challenges in efficiently integrating information from multiple views, which hinders their ability to fully leverage complex data structures. To overcome this problem, we present an improved graph-level cross-attention mechanism specifically designed to enhance multi-view integration and improve the model's expressiveness in heterogeneous networks. By incorporating random walks, the Katz index, and Transformers, the model captures higher-order semantic relationships between nodes within the meta-path view. Node context information is extracted by decomposing the network and applying the attention mechanism within the network schema view. The improved graph-level cross-attention in the cross-view context adaptively fuses features from both views. Furthermore, a contrastive loss function is employed to select positive samples based on the local connection strength and global centrality of nodes, enhancing the model's robustness. The suggested self-supervised model performs exceptionally well in node classification and clustering tasks, according to experimental data, demonstrating the effectiveness of our method.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"190 ","pages":"Article 107681"},"PeriodicalIF":6.0,"publicationDate":"2025-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144185015","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neural Networks | Pub Date: 2025-05-26 | DOI: 10.1016/j.neunet.2025.107684
Chunling Fan, Yuebin Song, Xiaoqian Mao
{"title":"A classification method of motor imagery based on brain functional networks by fusing PLV and ECSP","authors":"Chunling Fan, Yuebin Song, Xiaoqian Mao","doi":"10.1016/j.neunet.2025.107684","DOIUrl":"10.1016/j.neunet.2025.107684","url":null,"abstract":"<div><div>In order to enhance the decoding ability of brain states and evaluate the functional connection changes of relevant nodes in brain regions during motor imagery (MI), this paper proposes a brain functional network construction method which fuses edge features and node features. And we use deep learning methods to realize MI classification of left and right hand grasping tasks. Firstly, we use phase locking value (PLV) to extract edge features and input a weighted PLV to enhanced common space pattern (ECSP) to extract node features. Then, we fuse edge features and node features to construct a novel brain functional network. Finally, we construct an attention and multi-scale feature convolutional neural network (AMSF-CNN) to validate our method. The performance indicators of the brain functional network on the SHU_Dataset in the corresponding brain region will increase and be higher than those in the contralateral brain region when imagining one hand grasping. The average accuracy of our method reaches 79.65 %, which has a 25.85 % improvement compared to the accuracy provided by SHU_Dataset. By comparing with other methods on SHU_Dataset and BCI IV 2a Dataset, the average accuracies achieved by our method outperform other references. Therefore, our method provides theoretical support for exploring the working mechanism of the human brain during MI.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"190 ","pages":"Article 107684"},"PeriodicalIF":6.0,"publicationDate":"2025-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144185016","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neural Networks | Pub Date: 2025-05-25 | DOI: 10.1016/j.neunet.2025.107603
Minghui Liao, Guojia Wan, Wenbin Hu, Bo Du
{"title":"Building connectome analysis tools with representation learning on neuronal skeleton and circuit topology","authors":"Minghui Liao, Guojia Wan, Wenbin Hu, Bo Du","doi":"10.1016/j.neunet.2025.107603","DOIUrl":"10.1016/j.neunet.2025.107603","url":null,"abstract":"<div><div>Analyzing connectome plays a significant role in the investigation of neurological diseases and brain research. However, the efficiency of utilizing anatomical, physiological, or molecular characteristics of neurons is relatively low and costly. With the advancements in volume electron microscopy(VEM) and analysis techniques for brain tissue, we are able to obtain whole-brain connectome consisting neuronal high-resolution morphology and connectivity information. Nevertheless, few tools are built based on such data for automated connectome analysis. In this paper, we introduce a connectome analysis tool based on a representation learning model termed NeuNet. NeuNet consists of three key components: Connectome Encoder, Skeleton Encoder, and Readout Layer, which together integrate information pertaining to neuronal connectivity and morphology. Furthermore, we reprocess and release a brain neuron reconstruction dataset from a <em>Drosophila</em> Nerve Cord VEM data. We apply the proposed tool to tasks related to connectome analysis, including neuron classification, brain circuit layout, neuron retrieval and neuron morphology description, and the experiments demonstrate the effectiveness of our tool. We will soon release our code and data on <span><span>https://github.com/WHUminghui/ConnectomeAnalysisTool</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"190 ","pages":"Article 107603"},"PeriodicalIF":6.0,"publicationDate":"2025-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144204856","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DiverseReID: Towards generalizable person re-identification via Dynamic Style Hallucination and decoupled domain experts","authors":"Jieru Jia, Huidi Xie, Qin Huang, Yantao Song, Peng Wu","doi":"10.1016/j.neunet.2025.107602","DOIUrl":"10.1016/j.neunet.2025.107602","url":null,"abstract":"<div><div>Person re-identification (re-ID) models often fail to generalize well when deployed to other camera networks with domain shift. A classical domain generalization (DG) solution is to enhance the diversity of source data so that a model can learn more domain-invariant, and hence generalizable representations. Existing methods typically mix images from different domains in a mini-batch to generate novel styles, but the mixing coefficient sampled from predefined Beta distribution requires careful manual tuning and may render sub-optimal performance. To this end, we propose a plug-and-play Dynamic Style Hallucination (DSH) module that adaptively adjusts the mixing weights based on the style distribution discrepancy between image pairs, which is dynamically measured with the reciprocal of Wasserstein distances. This approach not only reduces the tedious manual tuning of parameters but also significantly enriches style diversity by expanding the perturbation space to the utmost. In addition, to promote inter-domain diversity, we devise a Domain Experts Decoupling (DED) loss, which constrains features from one domain to go towards the orthogonal direction against features from other domains. The proposed approach, dubbed DiverseReID, is parameter-free and computationally efficient. Without bells and whistles, it outperforms the state-of-the-art on various DG re-ID benchmarks. Experiments verify that style diversity, not just the size of the training data, is crucial for enhancing generalization.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"189 ","pages":"Article 107602"},"PeriodicalIF":6.0,"publicationDate":"2025-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144135130","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neural Networks | Pub Date: 2025-05-24 | DOI: 10.1016/j.neunet.2025.107601
Zhijian Zhuo, Bilian Chen
{"title":"Overlapping community detection via Layer-Jaccard similarity incorporated nonnegative matrix factorization","authors":"Zhijian Zhuo , Bilian Chen","doi":"10.1016/j.neunet.2025.107601","DOIUrl":"10.1016/j.neunet.2025.107601","url":null,"abstract":"<div><div>As information modernization progresses, the connections between entities become more elaborate, forming more intricate networks. Consequently, the emphasis on community detection has transitioned from discerning disjoint communities towards the identification of overlapping communities. A variety of algorithms based on the sparse adjacency matrix, which are sensitive to edge connections, are suitable for detecting edge-sparse areas between overlapping communities but lack the ability to detect edge-dense areas within the overlapping communities. Additionally, most algorithms do not take into account multihop information. To mitigate the aforementioned limitations, we propose an innovative approach termed Layer-Jaccard similarity incorporated nonnegative matrix factorization (LJSINMF), which utilizes both the adjacency matrix and the Layer-Jaccard similarity matrix. Our method initially employs a newly proposed Onion-shell method to decompose the network into layers. Subsequently, the layer of each node is used to construct a Layer-Jaccard similarity matrix, which facilitates the identification of edge-dense areas within the overlapping communities and serves as a general approach for enhancing other nonnegative matrix factorization-based algorithms. Ultimately, we integrate the adjacency matrix and the Layer-Jaccard similarity matrix into the nonnegative matrix factorization framework to determine the node-community membership matrix. Moreover, integrating the Layer-Jaccard similarity matrix into other algorithms is a promising approach to enhance their performance. Comprehensive experiments have been conducted on real-world networks and the results substantiate that the LJSINMF algorithm outperforms most state-of-the-art baseline methods in terms of three evaluation metrics.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"189 ","pages":"Article 107601"},"PeriodicalIF":6.0,"publicationDate":"2025-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144170220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neural Networks | Pub Date: 2025-05-23 | DOI: 10.1016/j.neunet.2025.107599
Zhibin Lan, Jiawei Yu, Shiyu Liu, Junfeng Yao, Degen Huang, Jinsong Su
{"title":"Towards better text image machine translation with multimodal codebook and multi-stage training","authors":"Zhibin Lan , Jiawei Yu , Shiyu Liu , Junfeng Yao , Degen Huang , Jinsong Su","doi":"10.1016/j.neunet.2025.107599","DOIUrl":"10.1016/j.neunet.2025.107599","url":null,"abstract":"<div><div>As a widely-used machine translation task, text image machine translation (TIMT) aims to translate the source texts embedded in the image to target translations. However, studies in this aspect face two challenges: (1) constructed in a cascaded manner, dominant models suffer from the error propagation of optical character recognition (OCR), and (2) they lack publicly available large-scale datasets. To deal with these issues, we propose a multimodal codebook based TIMT model. In addition to a text encoder, an image encoder, and a text decoder, our model is equipped with a multimodal codebook that effectively associates images with relevant texts, thus providing useful supplementary information for translation. Particularly, we present a multi-stage training framework to fully exploit various datasets to effectively train our model. Concretely, we first conduct preliminary training on the text encoder and decoder using bilingual texts. Subsequently, via an additional code-conditioned mask translation task, we use the bilingual texts to continuously train the text encoder, multimodal codebook, and decoder. Afterwards, by further introducing an image-text alignment task and adversarial training, we train the whole model except for the text decoder on the OCR dataset. Finally, through the above training tasks except for text translation, we adopt a TIMT dataset to fine-tune the whole model. Besides, we manually annotate a Chinese-English TIMT dataset, named OCRMT30K, and extend it to Chinese-German TIMT dataset through an automatic translation tool. To the best of our knowledge, it is the first public manually-annotated TIMT dataset, which facilitates future studies in this task. To investigate the effectiveness of our model, we conduct extensive experiments on Chinese-English and Chinese-German TIMT tasks. Experimental results and in-depth analyses strongly demonstrate the effectiveness of our model. We release our code and dataset on <span><span>https://github.com/DeepLearnXMU/mc_tit</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"189 ","pages":"Article 107599"},"PeriodicalIF":6.0,"publicationDate":"2025-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144139614","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HMgNO: Hybrid multigrid neural operator with low-order numerical solver for partial differential equations","authors":"Yifan Hu , Weimin Zhang , Fukang Yin , Jianping Wu","doi":"10.1016/j.neunet.2025.107649","DOIUrl":"10.1016/j.neunet.2025.107649","url":null,"abstract":"<div><div>Traditional numerical methods face a trade-off between computational cost and accuracy when solving partial differential equations. Low-order solvers are fast but less accurate, while high-order solvers are accurate but much slower. To address this challenge, we propose a novel framework, the hybrid multigrid neural operator (HMgNO). The HMgNO couples a low-order numerical solver with a multigrid neural operator, and the neural operator is used to correct the low-order numerical solutions to obtain high-order accuracy at each fixed time step size. Thus, the HMgNO achieves accurate solutions while ensuring computational efficiency. Moreover, our framework supports multiple types of low-order numerical solvers, such as finite difference and spectral methods. Experiments on the Navier-Stokes, shallow-water, and diffusion-reaction equations demonstrate that the proposed framework achieves the lowest relative error and smallest spectral bias with few model parameters and fast inference speed.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"190 ","pages":"Article 107649"},"PeriodicalIF":6.0,"publicationDate":"2025-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144185013","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neural Networks | Pub Date: 2025-05-23 | DOI: 10.1016/j.neunet.2025.107552
Weihao Luo, Zezhen Zeng, Yueqi Zhong
{"title":"Enhancing image-based virtual try-on with Multi-Controlled Diffusion Models","authors":"Weihao Luo , Zezhen Zeng , Yueqi Zhong","doi":"10.1016/j.neunet.2025.107552","DOIUrl":"10.1016/j.neunet.2025.107552","url":null,"abstract":"<div><div>Image-based virtual try-on technology digitally overlays clothing onto images of individuals, enabling users to preview how garments fit without physical trial, thus enhancing the online shopping experience. While current diffusion-based virtual try-on networks produce high-quality results, they struggle to accurately render garments with textual designs such as logos or prints which are widely prevalent in the real world, often carrying significant brand and cultural identities. To address this challenge, we introduce the Multi-Controlled Diffusion Models for Image-based Virtual Try-On (MCDM-VTON), a novel approach that synergistically incorporates global image features and local textual features extracted from garments to control the generation process. Specifically, we innovatively introduce an Optical Character Recognition (OCR) model to extract the text-style textures from clothing, utilizing the information gathered as text features. These features, in conjunction with the inherent global image features through a multimodal feature fusion module based on cross-attention, jointly control the denoising process of the diffusion models. Moreover, by extracting text information from both the generated virtual try-on results and the original garment images with the OCR model, we have devised a new content-style loss to supervise the training of diffusion models, thereby reinforcing the generation effect of text-style textures. Extensive experiments demonstrate that MCDM-VTON significantly outperforms existing state-of-the-art methods in terms of text preservation and overall visual quality.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"189 ","pages":"Article 107552"},"PeriodicalIF":6.0,"publicationDate":"2025-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144123671","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neural Networks | Pub Date: 2025-05-22 | DOI: 10.1016/j.neunet.2025.107664
Shengbin Zheng, Dechang Pi
{"title":"VKAD: A novel fault detection and isolation model for uncertainty-aware industrial processes","authors":"Shengbin Zheng, Dechang Pi","doi":"10.1016/j.neunet.2025.107664","DOIUrl":"10.1016/j.neunet.2025.107664","url":null,"abstract":"<div><div>Fault detection and isolation (FDI) are essential for effective monitoring of industrial processes. Modern industrial processes involve dynamic systems characterized by complex, high-dimensional nonlinearities, posing significant challenges for accurate modeling and analysis. Recent studies have employed deep learning methods to capture and model these complexities in dynamic systems. In contrast, Koopman operator theory offers an alternative perspective, as the Koopman operator describes the linear evolution of observables in nonlinear systems within a high-dimensional space. This linearization simplifies complex nonlinear dynamics, making them easier to analyze and interpret in higher-dimensional settings. However, the Koopman operator theory does not inherently incorporate uncertainties in dynamical systems, which can hinder its performance in process monitoring. To tackle this issue, we integrate Koopman operator theory with Variational Autoencoders to propose a novel fault detection and isolation model called the Variational Koopman Anomaly Detector (VKAD). VKAD is capable of inferring the distribution of observables from time series data of dynamical systems. By advancing the distribution through the Koopman operator over time, VKAD can capture the uncertainty in the evolution of dynamic systems. The uncertainty estimates yielded by VKAD are applicable for both fault detection and isolation in industrial processes. The effectiveness of the proposed VKAD were illustrated using the Tennessee Eastman Process (TEP) and a real satellite on-orbit telemetry dataset (SAT). The experimental results demonstrate that the Fault Detection Rate (FDR) of VKAD achieves superior performance on both the TEP and SAT datasets compared to advanced methods, while the Fault Alarm Rate (FAR) is also highly competitive.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"189 ","pages":"Article 107664"},"PeriodicalIF":6.0,"publicationDate":"2025-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144139615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}