Neural NetworksPub Date : 2025-09-29DOI: 10.1016/j.neunet.2025.108149
Yun Zhou, Yuqiang Wu, Qiaoyun Wu, Chunyu Tan, Shu Zhan, Richang Hong
{"title":"Dual-head prediction and reconstruction with coarse-to-fine masks for visual reinforcement learning.","authors":"Yun Zhou, Yuqiang Wu, Qiaoyun Wu, Chunyu Tan, Shu Zhan, Richang Hong","doi":"10.1016/j.neunet.2025.108149","DOIUrl":"https://doi.org/10.1016/j.neunet.2025.108149","url":null,"abstract":"<p><p>In situations of limited experience and high-dimensional input data, effective representation learning plays a vital role in enabling visual reinforcement learning (RL) to excel in diverse tasks. To better leverage the agent's sampled trajectory during the training process, we introduce the DPRM approach, which involves a Dual-head Prediction and Reconstruction task with coarse-to-fine Masks in RL. The DPRM method tackles these challenges through integration of coarse-to-fine masks with a dual-head prediction-reconstruction (DHPR) architecture, complemented by a coordinate-based spatial coding strategy (CSCS). The CSCS enhances the spatial information of the observation state, facilitating the capture of motion changes between continuous context states. Furthermore, the coarse-to-fine masks gradually refine, guiding the following DHPR model to learn essential features and semantics more effectively. Built on a transformer architecture, DHPR introduces a novel triplet input token comprising two consecutive actions paired with an observation state. This design facilitates bidirectional prediction of past and future states from temporal extremities while efficiently reconstructing masked latent features throughout state sequences. Experimental results on both multiple continuous control (DeepMind Control Suite benchmarks) and discrete control (Atari) tasks demonstrate that the DPRM algorithm significantly enhances performance, leading to higher reward accumulation and faster convergence. 
Code is available here.</p>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"194 ","pages":"108149"},"PeriodicalIF":6.3,"publicationDate":"2025-09-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145259430","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
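The coarse-to-fine masking idea above can be sketched as a schedule whose mask granularity shrinks over training: early steps mask large coarse blocks, later steps mask small fine-grained patches. The linear block-size decay and the ~50 % mask ratio below are assumptions for illustration; the abstract does not specify DPRM's exact schedule.

```python
import numpy as np

def coarse_to_fine_mask(h, w, step, total_steps, rng):
    """Binary mask over an (h, w) feature map whose block size decays from
    h // 2 down to 1 as training progresses (hypothetical linear schedule;
    the paper's exact schedule and mask ratio are not given in the abstract)."""
    frac = step / max(total_steps - 1, 1)
    block = max(1, int(round((h // 2) * (1 - frac) + 1 * frac)))
    mask = np.zeros((h, w), dtype=bool)
    # Mask roughly half of the blocks at the current granularity.
    for i in range(0, h, block):
        for j in range(0, w, block):
            if rng.random() < 0.5:
                mask[i:i + block, j:j + block] = True
    return mask, block
```

Early in training the masked regions are large, forcing the model to recover coarse semantics; late in training they shrink to single cells, pushing it toward fine details.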
Neural NetworksPub Date : 2025-09-27DOI: 10.1016/j.neunet.2025.108157
Jongmin Han, Seokho Kang
{"title":"Consistency-regularized graph neural networks for molecular property prediction.","authors":"Jongmin Han, Seokho Kang","doi":"10.1016/j.neunet.2025.108157","DOIUrl":"https://doi.org/10.1016/j.neunet.2025.108157","url":null,"abstract":"<p><p>Although graph neural networks (GNNs) have proven powerful in molecular property prediction tasks, they tend to underperform when trained on small datasets. Conventional data augmentation strategies are generally ineffective in this context, as simply perturbing molecular graphs can unintentionally alter their intrinsic properties. In this study, we propose a consistency-regularized graph neural network (CRGNN) method to better utilize molecular graph augmentation during training. We apply molecular graph augmentation to obtain strongly and weakly-augmented views for each molecular graph. By incorporating a consistency regularization loss into the learning objective, the GNN is encouraged to learn representations such that the strongly-augmented views of a molecular graph are mapped close to a weakly-augmented view of the same graph. In doing so, molecular graph augmentation can contribute to improving the prediction performance of the GNN while mitigating its negative effects. 
Through experimental evaluation on various molecular benchmark datasets, we demonstrate that the proposed method outperforms existing methods that leverage molecular graph augmentation, especially when the training dataset is small.</p>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"194 ","pages":"108157"},"PeriodicalIF":6.3,"publicationDate":"2025-09-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145245697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
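The consistency-regularization objective described above can be illustrated with a minimal sketch: a supervised term plus a penalty pulling the representation of the strongly-augmented view toward that of the weakly-augmented view. The squared-error distance and the weight `lam` are assumptions; the paper's exact loss form is not given in the abstract.

```python
import numpy as np

def consistency_loss(z_weak, z_strong):
    """Mean-squared consistency term between representations of weakly- and
    strongly-augmented views of the same molecular graph (generic sketch)."""
    return float(np.mean((z_strong - z_weak) ** 2))

def total_loss(pred, target, z_weak, z_strong, lam=0.1):
    """Supervised property-prediction loss (regression example) plus the
    consistency regularizer, weighted by a hypothetical coefficient lam."""
    supervised = float(np.mean((pred - target) ** 2))
    return supervised + lam * consistency_loss(z_weak, z_strong)
```

Identical views incur zero penalty, so the regularizer only acts when the strong augmentation pushes the representation away from the weak one.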
Neural NetworksPub Date : 2025-09-25DOI: 10.1016/j.neunet.2025.108140
Chunyi Hou , Yongchuan Yu , Jinquan Ji , Siyao Zhang , Xumeng Shen , Jianzhuo Yan
{"title":"Graph-patchformer: Patch interaction transformer with adaptive graph learning for multivariate time series forecasting","authors":"Chunyi Hou , Yongchuan Yu , Jinquan Ji , Siyao Zhang , Xumeng Shen , Jianzhuo Yan","doi":"10.1016/j.neunet.2025.108140","DOIUrl":"10.1016/j.neunet.2025.108140","url":null,"abstract":"<div><div>Multivariate time series (MTS) forecasting plays a pivotal role in the digitalization and intelligent development of modern society, while previous MTS forecasting methods based on deep learning often rely on capturing intra-series dependencies for modeling, neglecting the structural information within MTS and failing to consider inter-series local dynamic dependencies. Although some approaches utilize multi-scale representation learning to capture inter-series dynamic dependencies at different time scales, they still require additional multi-scale feature fusion modules to output the multi-scale representation of final forecasting results. In this paper, we propose a novel deep learning framework called Graph-Patchformer, which leverages structural encodings to reflect the structural information within MTS while capturing intra-series dependencies and inter-series local dynamic dependencies using the Patch Interaction Blocks we proposed. Specifically, Graph-Patchformer embeds structural encodings into MTS to reflect the inter-series relationships and temporal variations within the MTS. The embedded data is subsequently fed into the Patch Interaction Blocks through a patching operation. Within the Patch Interaction Blocks, the multi-head self-attention mechanism and adaptive graph learning module are employed to capture intra-series dependencies and inter-series local dynamic dependencies. In this way, Graph-Patchformer not only facilitates interactions between different patches within a single series but also enables cross-time-window interactions between patches of different series. 
The experimental results show that Graph-Patchformer outperforms several state-of-the-art methods across various real-world benchmark datasets. The code will be available at this repository: <span><span>https://github.com/houchunyiPhd/Graph-Patchformer/tree/main</span><svg><path></path></svg></span></div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"194 ","pages":"Article 108140"},"PeriodicalIF":6.3,"publicationDate":"2025-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145221969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
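The patching operation the abstract mentions can be sketched as slicing each series into (possibly overlapping) windows before embedding. This is a minimal sketch of the generic patching step only; the embedding and the Patch Interaction Blocks themselves are omitted, and `patch_len`/`stride` are illustrative parameters.

```python
import numpy as np

def make_patches(x, patch_len, stride):
    """Split a multivariate series x of shape (num_series, seq_len) into
    patches of shape (num_series, num_patches, patch_len) using fancy
    indexing, so adjacent patches overlap when stride < patch_len."""
    n, t = x.shape
    num_patches = (t - patch_len) // stride + 1
    # Row r of idx holds the time indices of patch r.
    idx = np.arange(patch_len)[None, :] + stride * np.arange(num_patches)[:, None]
    return x[:, idx]
```

Each patch then becomes one token, so attention can act both across patches of one series and, with the adaptive graph, across patches of different series.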
Neural NetworksPub Date : 2025-09-25DOI: 10.1016/j.neunet.2025.108154
Abdulrahman Noman , Zou Beiji , Chengzhang Zhu , Mohammed Al-Habib , Ahmed Alasri
{"title":"Cascading size-dependent deep propagation (CADP): Addressing over-smoothing in graph few-shot dermatology classification","authors":"Abdulrahman Noman , Zou Beiji , Chengzhang Zhu , Mohammed Al-Habib , Ahmed Alasri","doi":"10.1016/j.neunet.2025.108154","DOIUrl":"10.1016/j.neunet.2025.108154","url":null,"abstract":"<div><div>Graphs play a critical role in capturing complex data relationships, particularly in few-shot learning tasks. However, one of the major challenges in graph-based models, such as Graph Neural Networks (GNNs), is the issue of over-smoothing, which diminishes the discriminative power of node representations. This problem arises when GNNs aggregate information from too large a neighborhood, leading to homogenization of node features. To overcome this limitation, we propose <em>Cascading Size-Dependent Deep Propagation (CADP)</em>, a novel approach designed to mitigate over-smoothing in graph-based few-shot learning, with a particular focus on improving skin disease classification. The model constructs the graph by employing a convolutional neural network (CNN) to extract feature representations from a small set of support and query images, where the nodes represent the extracted features, and the edges reflect the similarity between them. To improve feature representation and prevent over-smoothing, the model decouples the feature propagation process from the neural network to avoid repeated nonlinear transformations that lead to over-smoothing, enabling deeper information flow while preserving discriminative features. Then the initial support labels are integrated with the early prediction labels of query images, which are generated by a Multi-Layer Perceptron (MLP). Furthermore, this aggregated data is optimized through deep label propagation, which leverages the underlying graph structure to enhance classification accuracy. 
The propagation depths are controlled by the hyperparameters <span><math><msub><mi>K</mi><mn>1</mn></msub></math></span> and <span><math><msub><mi>K</mi><mn>2</mn></msub></math></span>, which are determined based on graph size, to regulate how extensively features and labels are propagated. We evaluate our approach on three dermatology datasets: ISIC 2018, Derm7pt, and SD-198, achieving 78.3 %, 79.29 %, and 91.92 % accuracy, respectively, in the 2-way 5-shot setting. CADP outperforms existing methods on all datasets, demonstrating its effectiveness in skin disease classification.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"194 ","pages":"Article 108154"},"PeriodicalIF":6.3,"publicationDate":"2025-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145214222","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
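The decoupled propagation described above can be sketched as repeated multiplication by a normalized adjacency with no learned transform between hops, which is what allows depth to grow without stacking nonlinearities. This is a generic decoupled-propagation sketch; CADP's size-dependent choice of depth (the K<sub>1</sub>/K<sub>2</sub> hyperparameters) is abstracted into the single `k` argument.

```python
import numpy as np

def decoupled_propagate(adj, feats, k):
    """Propagate node features k hops with a row-normalized adjacency
    (self-loops added), with no nonlinear transform between hops."""
    a = adj + np.eye(adj.shape[0])        # add self-loops
    a = a / a.sum(axis=1, keepdims=True)  # row-normalize
    z = feats
    for _ in range(k):
        z = a @ z
    return z
```

The same operator can propagate the concatenated support/MLP labels in place of `feats`, which is the label-propagation half of the method.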
Neural NetworksPub Date : 2025-09-25DOI: 10.1016/j.neunet.2025.108152
Mingzhu Tai , Zhenqiu Shu , Songze Tang , Zhengtao Yu
{"title":"Spatial-spectral multi-order gated aggregation network with bidirectional interactive fusion for hyperspectral image classification","authors":"Mingzhu Tai , Zhenqiu Shu , Songze Tang , Zhengtao Yu","doi":"10.1016/j.neunet.2025.108152","DOIUrl":"10.1016/j.neunet.2025.108152","url":null,"abstract":"<div><div>Recently, convolutional neural networks (CNNs) have made significant strides in hyperspectral image classification (HSIC) tasks by contextualizing the convolutional kernels as global as possible. However, as the kernel sizes increase, encoding multi-order feature interactions becomes less efficient. Furthermore, self-attention mechanisms and convolutional operations can only handle global and local features independently, resulting in overly complex or simplified interactions. To overcome these limitations, in this work, we propose a novel HSIC framework called the Spatial-Spectral Multi-order Gated Aggregation Network with Bidirectional Interaction Fusion (SS-MoGAN). The proposed SS-MoGAN method integrates simple yet powerful convolutions and gated aggregations into a compact module, facilitating efficient feature extraction and adaptive contextual processing. Specifically, the spatial aggregation (SpaAg) and spectral aggregation (SpeAg) blocks guide the model to explicitly capture the interactions between low- and high-order features within the spatial and spectral dimensions. The bidirectional interaction fusion (BIF) blocks further integrate structural information through a bidirectional cross-attention mechanism, enhancing the representation of fine-grained details. Extensive experiments on three hyperspectral benchmark datasets demonstrate that the proposed SS-MoGAN method outperforms other state-of-the-art methods in HSIC applications. 
The source code for this work is available at <span><span>https://github.com/szq0816/SS-MoGAN_HSIC</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"194 ","pages":"Article 108152"},"PeriodicalIF":6.3,"publicationDate":"2025-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145222046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
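The gated aggregation at the heart of the method can be illustrated with the generic multiplicative-gating pattern: a gate branch modulates an aggregated value branch elementwise, letting the block reweight context adaptively. This sketch shows the gating idea only; the SpaAg/SpeAg convolutions are omitted and the weight matrices here are placeholders, not the paper's parameters.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_aggregation(context, gate_w, value_w):
    """Elementwise product of a sigmoid gate branch and a value branch,
    both computed from the same context features (generic gating sketch)."""
    gate = sigmoid(context @ gate_w)   # in (0, 1): how much of each channel passes
    value = context @ value_w          # aggregated contextual features
    return gate * value
```

Because the gate is input-dependent, the same block can emphasize low-order detail for some pixels and high-order context for others.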
Neural NetworksPub Date : 2025-09-25DOI: 10.1016/j.neunet.2025.108142
Thiago Carvalho , Marley Vellasco , José Franco Amaral
{"title":"Towards out-of-distribution detection using gradient vectors","authors":"Thiago Carvalho , Marley Vellasco , José Franco Amaral","doi":"10.1016/j.neunet.2025.108142","DOIUrl":"10.1016/j.neunet.2025.108142","url":null,"abstract":"<div><div>Deploying Deep Learning algorithms in the real world requires some care that is generally not considered in the training procedure. In real-world scenarios, where the input data cannot be controlled, it is important for a model to identify when a sample does not belong to any known class. This is accomplished using out-of-distribution (OOD) detection, a technique designed to distinguish unknown samples from those that belong to the in-distribution classes. These methods mainly rely on output or intermediate features to calculate OOD scores, but the gradient space is still under-explored for this task. In this work, we propose a new family of methods using gradient features, named GradVec, using the gradient space as input representation for different OOD detection methods. The main idea is that the model gradient presents, in a more informative way, the knowledge that a sample belongs to a known class, being able to distinguish it from other unknown ones. GradVec methods do not change the model training procedure and no additional data is needed to adjust the OOD detector, and it can be used on any pre-trained model. 
Our approach presents superior results in different scenarios for OOD detection in image classification and text classification, reducing FPR95 by up to 26.67 % and 21.29 %, respectively.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"194 ","pages":"Article 108142"},"PeriodicalIF":6.3,"publicationDate":"2025-09-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145221968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
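The gradient-space intuition can be made concrete for the simplest case, a linear softmax classifier: take the gradient of the cross-entropy loss with respect to the classifier weights using the predicted class as a pseudo-label, and score by its norm. A confident in-distribution sample yields a near-zero gradient; an uncertain sample yields a large one. This is a minimal stand-in for the GradVec idea; the paper feeds gradient vectors into various detectors rather than simply taking a norm.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def gradvec_score(x, w):
    """Norm of d(CE)/dW for a linear softmax classifier, with the predicted
    class used as pseudo-label (no true label is needed at test time)."""
    p = softmax(w @ x)
    y = np.zeros_like(p)
    y[np.argmax(p)] = 1.0          # pseudo-label = predicted class
    grad = np.outer(p - y, x)      # closed-form cross-entropy gradient
    return float(np.linalg.norm(grad))
```

For a confident prediction, `p` is close to the pseudo-label one-hot, so `p - y` and hence the score are near zero; maximal uncertainty pushes the score up.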
Neural NetworksPub Date : 2025-09-24DOI: 10.1016/j.neunet.2025.108146
Jiao Huang , Qianli Xing , Jinglong Ji , Bo Yang
{"title":"Reinforcement learning with formation energy feedback for material diffusion models","authors":"Jiao Huang , Qianli Xing , Jinglong Ji , Bo Yang","doi":"10.1016/j.neunet.2025.108146","DOIUrl":"10.1016/j.neunet.2025.108146","url":null,"abstract":"<div><div>Generative models are emerging as foundation tools for the discovery of new materials with remarkable efficiency. Existing works introduce physical constraints during the generation process of diffusion models to improve the quality of the generated crystals. However, it is difficult to accurately capture the distribution of stable crystal material structures, given the complex periodic crystal structure and the limited available crystal material data, even with the incorporation of symmetries and other domain-specific knowledge. Thus, these models still struggle to achieve a high success rate in producing stable crystal materials. To further improve the stability of generative crystal materials, we propose a novel fine-tuning framework RLFEF. We formulate the material diffusion process as a Markov Decision Process with formation energy serving as rewards. Moreover, we prove that optimizing the expected return in reinforcement learning is equivalent to applying policy gradient updates to a diffusion model. Additionally, we prove that the fine-tuned model adheres to the unique symmetry of crystal materials. Extensive experiments are conducted on three real-world datasets. 
The results show that our model achieves state-of-the-art performance on most tasks related to property optimization, ab initio generation, crystal structure prediction, and material generation.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"194 ","pages":"Article 108146"},"PeriodicalIF":6.3,"publicationDate":"2025-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145214229","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
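The formation-energy-as-reward framing can be sketched in one step: lower (more stable) formation energy maps to higher reward, centered by a baseline to reduce policy-gradient variance; the resulting weights would scale the diffusion model's denoising log-likelihood gradients. This is only a sketch of the RL reward shaping, under the assumption of a simple negated-energy reward and a mean baseline, neither of which is specified in the abstract.

```python
import numpy as np

def reinforce_weights(formation_energies, baseline=None):
    """Convert per-sample formation energies into centered REINFORCE
    weights: reward = -energy (stable crystals get positive weight),
    minus a mean baseline for variance reduction."""
    rewards = -np.asarray(formation_energies, dtype=float)
    if baseline is None:
        baseline = rewards.mean()
    return rewards - baseline
```

In a fine-tuning loop, each generated structure's weight would multiply the gradient of its denoising trajectory's log-probability.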
Neural NetworksPub Date : 2025-09-24DOI: 10.1016/j.neunet.2025.108155
Huyu Wu, Bowen Jia, Xue-Ming Yuan
{"title":"LLM-led vision-spectral fusion: A zero-shot approach to temporal fruit image classification.","authors":"Huyu Wu, Bowen Jia, Xue-Ming Yuan","doi":"10.1016/j.neunet.2025.108155","DOIUrl":"https://doi.org/10.1016/j.neunet.2025.108155","url":null,"abstract":"<p><p>A zero-shot multimodal framework for temporal image classification is proposed, targeting automated fruit quality assessment. The approach leverages large language models for expert-level semantic description generation, which guides zero-shot object detection and segmentation through GLIP and SAM models. Visual features and spectral data are fused to capture both external appearance and internal biochemical properties of fruits. Experiments on the newly constructed Avocado Freshness Temporal-Spectral dataset-comprising daily synchronized images and spectral measurements across the full spoilage lifecycle-demonstrate reductions in mean squared error by up to 33 % and mean absolute error by up to 17 % compared to established baselines. These results validate the effectiveness and generalizability of the framework for temporal image analysis in smart agriculture and food quality monitoring.</p>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"194 ","pages":"108155"},"PeriodicalIF":6.3,"publicationDate":"2025-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145245789","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neural NetworksPub Date : 2025-09-23DOI: 10.1016/j.neunet.2025.108143
Yueyang Pi , Yang Huang , Yongquan Shi , Fuhai Chen , Shiping Wang
{"title":"Implicit graph neural networks with flexible propagation operators","authors":"Yueyang Pi , Yang Huang , Yongquan Shi , Fuhai Chen , Shiping Wang","doi":"10.1016/j.neunet.2025.108143","DOIUrl":"10.1016/j.neunet.2025.108143","url":null,"abstract":"<div><div>Due to the capability to capture high-order information of nodes and reduce memory consumption, implicit graph neural networks have become an explored hotspot in recent years. However, these implicit graph neural networks are limited by the static topology, which makes it difficult to handle heterophilic graph-structured data. Furthermore, the existing methods inspired by optimization problem are limited by the explicit structure of graph neural networks, which makes it difficult to set an appropriate number of network layers to solve optimization problems. To address these issues, we propose an implicit graph neural network with flexible propagation operators in this paper. From the optimization objective function, we derive an implicit message passing formula with flexible propagation operators. Compared to the static operator, the proposed method that joints the dynamic semantic and topology of data is more applicable to heterophilic graphs. Moreover, the proposed model performs a fixed-point iterative process for the optimization of the objective function, which implicitly adjusts the number of network layers without requiring sufficient prior knowledge. 
Extensive experimental results demonstrate the superiority of the proposed model.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"194 ","pages":"Article 108143"},"PeriodicalIF":6.3,"publicationDate":"2025-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145221970","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
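The fixed-point iteration that replaces an explicit layer count can be sketched generically: iterate Z = tanh(α · Â Z + X) until convergence, which is the "infinite-depth" propagation implicit GNNs use instead of stacking layers. With a row-normalized adjacency and α < 1 the map is contractive, so iteration converges. This is a generic implicit-GNN sketch; the paper's flexible, data-dependent propagation operators would replace the static `a` below.

```python
import numpy as np

def implicit_gnn_fixed_point(adj, x, alpha=0.5, tol=1e-6, max_iter=1000):
    """Solve the implicit layer Z = tanh(alpha * A_norm @ Z + X) by
    fixed-point iteration. tanh is 1-Lipschitz and the row-normalized
    operator has norm <= 1, so alpha < 1 guarantees a contraction."""
    a = adj + np.eye(adj.shape[0])
    a = a / a.sum(axis=1, keepdims=True)   # row-normalized propagation operator
    z = np.zeros_like(x)
    for i in range(max_iter):
        z_new = np.tanh(alpha * (a @ z) + x)
        if np.max(np.abs(z_new - z)) < tol:
            return z_new, i + 1
        z = z_new
    return z, max_iter
```

The number of iterations adapts to the problem instance, which is the sense in which the layer count is set implicitly rather than by prior knowledge.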
Neural NetworksPub Date : 2025-09-23DOI: 10.1016/j.neunet.2025.108133
Yao Zhang , Lang Qin , Zhongtian Bao , Hongru Liang , Jun Wang , Zhenglu Yang , Zhe Sun , Andrzej Cichocki
{"title":"Adaptive knowledge selection in dialogue systems: Accommodating diverse knowledge types, requirements, and generation models","authors":"Yao Zhang , Lang Qin , Zhongtian Bao , Hongru Liang , Jun Wang , Zhenglu Yang , Zhe Sun , Andrzej Cichocki","doi":"10.1016/j.neunet.2025.108133","DOIUrl":"10.1016/j.neunet.2025.108133","url":null,"abstract":"<div><div>Effective knowledge-grounded dialogue systems rely heavily on accurate knowledge selection. This paper begins with an innovative new perspective that categorizes research on knowledge selection based on when knowledge is selected in relation to response generation: pre-, joint-, and post-selection. Among these, pre-selection is of great interest nowadays because they endeavor to provide sufficiently relevant knowledge inputs for downstream response generation models in advance. This reduces the burden of learning, adjusting, and interpreting for the subsequent response generation models, particularly for Large Language Models. Current knowledge pre-selection methods, however, still face three significant challenges: how to cope with different types of knowledge, adapt to the various knowledge requirements in different dialogue contexts, and adapt to different generation models. To resolve the above challenges, we propose ASK, an adaptive knowledge pre-selection method. It unifies various types of knowledge, scores their relevance and contribution to generating desired responses, and adapts the knowledge pool size to ensure the optimal amount is available for generation models. ASK is enhanced by leveraging rewards for selecting appropriate knowledge in both quality and quantity, through a reinforcement learning framework. We perform exhaustive experiments on two benchmarks (WoW and OpenDialKG) and get the following conclusions: 1) ASK has excellent knowledge selection capabilities on diverse knowledge types and requirements. 
2) ASK significantly enhances the performance of various downstream generation models, including ChatGPT and GPT-4o. 3) ASK's lightweight design reduces computational consumption by 40 %. Code is available at <span><span>https://github.com/AnonymousCode32213/ASK</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"194 ","pages":"Article 108133"},"PeriodicalIF":6.3,"publicationDate":"2025-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145201974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}