Neural Networks: Latest Articles

Transforming tabular data into images for deep learning models
IF 6.3 | CAS Q1 | Computer Science
Neural Networks Pub Date: 2026-07-01, Epub Date: 2026-02-10, DOI: 10.1016/j.neunet.2026.108715
Abdullah Elen, Emre Avuçlu
Abstract: Deep learning (DL) has achieved remarkable success in processing unstructured data such as images, text, and audio, yet its application to tabular numerical datasets remains challenging due to the lack of inherent spatial structure. In this study, we present a novel approach for transforming numerical tabular data into grayscale image representations, enabling the effective use of convolutional neural networks and other DL architectures on traditionally numerical datasets. The method normalizes features, organizes them into square image matrices, and generates labeled images for classification. Experiments were conducted on four publicly available datasets: Rice MSC Dataset (RMSCD), Optical Recognition of Handwritten Digits (Optdigits), TUNADROMD, and Spambase. Transformed datasets were evaluated using Residual Network (ResNet-18) and Directed Acyclic Graph Neural Network (DAG-Net) models with 5-fold cross-validation. The DAG-Net model achieved accuracies of 99.91% on RMSCD, 99.77% on Optdigits, 98.84% on TUNADROMD, and 93.06% on Spambase, demonstrating the efficacy of the proposed transformation. Additional ablation studies and efficiency analyses highlight improvements in training performance and computational cost. The results indicate that the proposed image-based transformation provides a practical and efficient strategy for integrating numerical datasets into deep learning workflows, broadening the applicability of DL techniques across diverse domains. The implementation is released as open-source software to facilitate reproducibility and further research.
Citations: 0
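The transformation the abstract describes (normalize features, arrange them into a square matrix, emit a grayscale image) can be sketched as follows. This is a minimal illustration of the general idea, not the authors' released implementation; the zero-padding and row-major ordering are assumptions.

```python
import numpy as np

def tabular_to_image(row, lo, hi):
    """Map one numeric feature vector to a square grayscale image.

    row    : 1-D array of features
    lo, hi : per-feature minima/maxima (taken from the training split)
    """
    # Min-max normalize each feature into [0, 1]; guard constant columns.
    scaled = (row - lo) / np.where(hi > lo, hi - lo, 1.0)
    # Zero-pad up to the next perfect square.
    side = int(np.ceil(np.sqrt(row.size)))
    padded = np.zeros(side * side)
    padded[:row.size] = scaled
    # Reshape to a square matrix of 8-bit gray levels.
    return (padded.reshape(side, side) * 255).astype(np.uint8)

# Example: a 7-feature row becomes a 3x3 grayscale image.
row = np.array([0.0, 2.0, 4.0, 1.0, 3.0, 5.0, 2.5])
img = tabular_to_image(row, np.zeros(7), np.full(7, 5.0))
```

Each labeled row then yields one labeled image, which a CNN can consume directly.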
Gradient-informed neural networks: Embedding prior beliefs for learning in low-data scenarios
IF 6.3 | CAS Q1 | Computer Science
Neural Networks Pub Date: 2026-07-01, Epub Date: 2026-02-02, DOI: 10.1016/j.neunet.2026.108681
Filippo Aglietti, Francesco Della Santa, Andrea Piano, Virginia Aglietti
Abstract: We propose Gradient-Informed Neural Networks (GradINNs), a methodology that can be used to efficiently approximate a wide range of functions in low-data regimes, when only general prior beliefs are available, a condition that is often encountered in complex engineering problems. GradINNs incorporate prior beliefs about the first-order derivatives of the target function to constrain the behavior of its gradient, thus implicitly shaping it, without requiring explicit access to the target function's derivatives. This is achieved by using two neural networks: one modeling the target function and a second, auxiliary network expressing the prior beliefs about the first-order derivatives (e.g., smoothness, oscillations). A customized loss function enables the training of the first network while enforcing gradient constraints derived from the auxiliary network; at the same time, it allows these constraints to be relaxed in accordance with the training data. Numerical experiments demonstrate the advantages of GradINNs, particularly in low-data regimes, with results showing strong performance compared to standard neural networks across the tested scenarios, including synthetic benchmark functions and real-world engineering tasks.
Citations: 0
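The composite loss the abstract describes (a data-fit term plus a penalty tying the model's input-gradient to a prior) can be sketched on a 1-D toy problem. Here the "target network" is a cubic polynomial with an analytic input-gradient and the "auxiliary network" is a fixed prior on the slope; both stand-ins, the collocation points, and the weighting `lam` are invented for illustration.

```python
import numpy as np

def f(theta, x):
    """Toy stand-in for the target network: a cubic in x."""
    return theta[0] + theta[1] * x + theta[2] * x**2 + theta[3] * x**3

def df_dx(theta, x):
    """Its analytic input-gradient (a real GradINN would use autodiff)."""
    return theta[1] + 2 * theta[2] * x + 3 * theta[3] * x**2

def prior_slope(x):
    """Toy stand-in for the auxiliary network: belief that f'(x) is about 1."""
    return np.ones_like(x)

def gradinn_loss(theta, x_data, y_data, x_coll, lam=0.1):
    # Data-fit term on the (few) labelled points...
    data = np.mean((f(theta, x_data) - y_data) ** 2)
    # ...plus a penalty matching the model's gradient to the prior
    # on unlabelled collocation points.
    grad = np.mean((df_dx(theta, x_coll) - prior_slope(x_coll)) ** 2)
    return data + lam * grad
```

Minimizing this over `theta` fits the data while implicitly shaping the gradient toward the prior; a larger `lam` enforces the belief more rigidly.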
NG-SNN: A neurogenesis-inspired dynamic adaptive framework for efficient spike classification
IF 6.3 | CAS Q1 | Computer Science
Neural Networks Pub Date: 2026-07-01, Epub Date: 2026-01-29, DOI: 10.1016/j.neunet.2026.108656
Jing Tang, Depeng Li, Zhenyu Zhang, Zhigang Zeng
Abstract: Spiking neural networks (SNNs) are designed for low-power neuromorphic computing. A widely adopted hybrid paradigm decouples feature extraction from classification to improve biological plausibility and modularity. However, this decoupling concentrates decision making in the downstream classifier, which in many systems becomes the limiting factor for both accuracy and efficiency. Hand-preset, fixed topologies risk either redundancy or insufficient capacity, and surrogate-gradient training remains computationally costly. Biological neurogenesis is the brain's mechanism for adaptively adding new neurons to build efficient, task-specific circuits. Inspired by this process, we propose the neurogenesis-inspired spiking neural network (NG-SNN), a dynamic adaptive framework that uses two key innovations to address these challenges. Specifically, we first introduce a supervised incremental construction mechanism that dynamically grows a task-optimal structure by selectively integrating neurons under a contribution criterion. Second, we devise an activity-dependent analytical learning method that replaces iterative optimization with single-shot and adaptive weight computation for each structural update, drastically improving training efficiency. Therefore, NG-SNN uniquely integrates dynamic structural adaptation with efficient non-iterative learning, forming a self-organizing and rapidly converging classification system. Moreover, this neurogenesis-driven process endows NG-SNN with a highly compact structure that requires significantly fewer parameters. Extensive experiments demonstrate that our NG-SNN matches or outperforms its competitors on diverse datasets, without the overhead of iterative training and manual architecture tuning.
Citations: 0
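The abstract's "single-shot analytical learning" replaces iterative optimization with a closed-form weight solve. A common form of such non-iterative learning is a regularized least-squares readout over the (spike-derived) feature matrix; the sketch below shows that idea only, under the assumption that NG-SNN's actual update is more elaborate.

```python
import numpy as np

def analytic_readout(H, Y, reg=1e-3):
    """Single-shot weight computation by regularized least squares.

    H : feature matrix (e.g., spike-rate features), shape (n_samples, n_hidden)
    Y : one-hot label matrix, shape (n_samples, n_classes)
    Solves W = (H^T H + reg*I)^{-1} H^T Y with one linear solve, no epochs.
    """
    A = H.T @ H + reg * np.eye(H.shape[1])
    return np.linalg.solve(A, H.T @ Y)
```

Because the solve is exact, each structural update (adding neurons means adding columns to `H`) can recompute the output weights without any gradient iterations.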
Trainable-parameter-free structural-diversity message passing for graph neural networks
IF 6.3 | CAS Q1 | Computer Science
Neural Networks Pub Date: 2026-07-01, Epub Date: 2026-02-10, DOI: 10.1016/j.neunet.2026.108711
Mingyue Kong, Yinglong Zhang, Chengda Xu, Xuewen Xia, Xing Xu
Abstract: Graph Neural Networks (GNNs) have achieved strong performance in structured data modeling such as node classification. However, real-world graphs often exhibit heterogeneous neighborhoods and complex feature distributions, while mainstream approaches rely on many learnable parameters and apply uniform aggregation to all neighbors. This lack of explicit modeling for structural diversity often leads to representation homogenization, semantic degradation, and poor adaptability under challenging conditions such as low supervision or class imbalance. To address these limitations, we propose a trainable-parameter-free graph neural network framework, termed the Structural-Diversity Graph Neural Network (SDGNN), which operationalizes structural diversity in message passing. At its core, the Structural-Diversity Message Passing (SDMP) mechanism performs within-group statistics followed by cross-group selection, thereby capturing neighborhood heterogeneity while stabilizing feature semantics. SDGNN further incorporates complementary structure-driven and feature-driven partitioning strategies, together with a normalized-propagation-based global structural enhancer, to enhance adaptability across diverse graphs. Extensive experiments on nine public benchmark datasets and an interdisciplinary PubMed citation network demonstrate that SDGNN consistently outperforms mainstream GNNs, especially under low supervision, class imbalance, and cross-domain transfer. The full implementation, including code and configurations, is publicly available at https://github.com/mingyue15694/SGDNN/tree/main.
Citations: 0
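The "within-group statistics followed by cross-group selection" step can be sketched for a single node. The group statistic (mean) and the selection rule (cosine similarity to the centre node) are assumptions chosen to keep the sketch parameter-free; the paper's actual statistics and selection criterion may differ.

```python
import numpy as np

def sdmp_aggregate(h_v, neighbor_feats, groups):
    """Parameter-free aggregation: per-group statistics, then selection.

    h_v            : feature of the centre node, shape (d,)
    neighbor_feats : features of its neighbours, shape (n, d)
    groups         : int partition label per neighbour (structure- or
                     feature-driven), shape (n,)
    """
    # Within-group statistics: mean feature of each neighbour group.
    stats = np.stack([neighbor_feats[groups == g].mean(axis=0)
                      for g in np.unique(groups)])
    # Cross-group selection: keep the group summary most similar to the
    # centre node (cosine similarity), stabilising feature semantics.
    sims = stats @ h_v / (np.linalg.norm(stats, axis=1)
                          * np.linalg.norm(h_v) + 1e-12)
    return stats[np.argmax(sims)]
```

Nothing here is trained: the partition, the statistic, and the selection are all fixed functions of the graph and features.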
CGLK-GNN: A connectome generation network with large kernels for GNN-based Alzheimer's disease analysis
IF 6.3 | CAS Q1 | Computer Science
Neural Networks Pub Date: 2026-07-01, Epub Date: 2026-02-07, DOI: 10.1016/j.neunet.2026.108689
Wenqi Zhu, Zhong Yin, Yinghua Fu, Alzheimer's Disease Neuroimaging Initiative
Abstract: Alzheimer's disease (AD) is a currently incurable neurodegenerative disease, with early detection representing a high research priority. AD is characterized by progressive cognitive decline accompanied by alterations in brain functional connectivity. Because such connectivity data has a graph-like structure, graph neural networks (GNNs) have emerged as important methods for brain function analysis and disease prediction in recent years. However, most GNN methods are limited by information loss caused by traditional functional connectivity calculation, as well as by common noise issues in functional magnetic resonance imaging (fMRI) data. This paper proposes a graph-generation-based AD classification model using resting-state fMRI to address these issues. The connectome generation network with large kernels for GNN-based AD analysis (CGLK-GNN) contains a graph generation block and a GNN prediction block. The graph generation block employs decoupled convolutional networks with large kernels to extract comprehensive temporal features while preserving sequential dependencies, contrasting with previous generative GNN approaches. This module constructs the connectome graph by encoding both edge-wise correlations and node-embedded temporal features, thereby utilizing the generated graph more effectively. The subsequent GNN prediction block adopts an efficient architecture to learn these enhanced representations and perform final AD stage classification. Through independent cohort validations, CGLK-GNN outperforms state-of-the-art GNN and rsfMRI-based AD classifiers in differentiating AD status. Furthermore, CGLK-GNN demonstrates high clinical value by learning clinically relevant connectome node and connectivity features from two independent datasets.
Citations: 0
SCAD: A self-constrained solution to automate context-guided zero-shot image anomaly detection
IF 6.3 | CAS Q1 | Computer Science
Neural Networks Pub Date: 2026-07-01, Epub Date: 2026-01-19, DOI: 10.1016/j.neunet.2026.108577
Siqi Wang, Guangpu Wang, Xinwang Liu, Jie Liu, Jiyuan Liu, Siwei Wang
Abstract: Image anomaly detection (IAD) usually requires a separate train set to build an inductive model, which then infers on the test set. However, the cost of collecting and labeling training images has inspired zero-shot IAD (ZS-IAD), which directly processes the test set without a train set. Most ZS-IAD methods resort to pre-trained foundation models (e.g., CLIP), which rely on external prompts and lack adaptation to the target IAD scene. By contrast, context-guided ZS-IAD methods have recently attracted growing interest: they not only avoid using external prompts by exploiting scene-specific context clues within unlabeled images, but also achieve superior performance to prior ZS-IAD counterparts. Unfortunately, existing context-guided ZS-IAD methods suffer from two vital flaws: the absence of a train set forces them to set key hyperparameters blindly, which leads to unreliable performance, and they do not actively handle mixed anomalies that disturb the learning process. To this end, we propose to automate context-guided ZS-IAD with a novel Self-Constrained Anomaly Detector (SCAD), which makes the following contributions: (1) We propose a novel self-constrained mechanism that can automatically determine proper values for key hyperparameters. (2) We design a new online self-constrained sampler that terminates the time-consuming sampling process at a proper stopping point, which can significantly reduce the computational cost. (3) We develop self-constrained normality refinement strategies that can actively constrain anomalies' impact and automatically rectify the stopping threshold. To the best of our knowledge, this is also the first work that addresses hyperparameter selection in the IAD realm. Experiments show that SCAD not only yields comparable performance to classic IAD solutions, but also matches ZS-IAD solutions enhanced by hindsight knowledge (i.e., hyperparameters validated on the test set).
Citations: 0
Efficient semantic segmentation via logit-guided feature distillation
IF 6.3 | CAS Q1 | Computer Science
Neural Networks Pub Date: 2026-07-01, Epub Date: 2026-01-29, DOI: 10.1016/j.neunet.2026.108663
Xuyi Yu, Shang Lou, Yinghai Zhao, Huipeng Zhang, Kuizhi Mei
Abstract: Knowledge Distillation (KD) is a critical technique for model compression, facilitating the transfer of implicit knowledge from a teacher model to a more compact, deployable student model. KD can be generally divided into two categories: logit distillation and feature distillation. Feature distillation has been predominant in achieving state-of-the-art (SOTA) performance, but recent advances in logit distillation have begun to narrow the gap. We propose a Logit-guided Feature Distillation (LFD) framework that combines the strengths of both logit and feature distillation to enhance the efficacy of knowledge transfer, particularly leveraging the rich classification information inherent in logits for semantic segmentation tasks. Furthermore, it is observed that Deep Neural Networks (DNNs) only manifest task-relevant characteristics at sufficient depths, which may be a limiting factor in achieving higher accuracy. In this work, we introduce a collaborative distillation method that preemptively focuses on critical pixels and categories in the early stage. We employ logits from deep layers to generate fine-grained spatial masks that are directly conveyed to the feature distillation stage, thereby inducing spatial gradient disparities. Additionally, we generate class masks that dynamically modulate the weights of shallow auxiliary heads, ensuring that class-relevant features can be calibrated by the primary head. A novel shared auxiliary head distillation approach is also presented. Experiments on the Cityscapes, Pascal VOC, and CamVid datasets show that the proposed method achieves competitive performance while maintaining low memory usage. Our code will be released at https://github.com/fate2715/LFD.
Citations: 0
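The core idea of a logit-derived spatial mask weighting a feature-distillation loss can be sketched as follows. The choice of per-pixel prediction entropy as the mask is an assumption for illustration; the paper's actual mask construction is not specified in the abstract.

```python
import numpy as np

def logit_guided_feature_loss(t_logits, t_feat, s_feat):
    """Weight a feature-distillation MSE by a spatial mask derived
    from the teacher's logits (here: normalized prediction entropy,
    so uncertain pixels receive more distillation pressure).

    t_logits       : teacher logits, shape (C, H, W)
    t_feat, s_feat : teacher/student feature maps, shape (D, H, W)
    """
    # Numerically stable softmax over classes, per pixel.
    e = np.exp(t_logits - t_logits.max(axis=0, keepdims=True))
    p = e / e.sum(axis=0, keepdims=True)
    # Normalized entropy in [0, 1] as the spatial mask.
    ent = -(p * np.log(p + 1e-12)).sum(axis=0) / np.log(t_logits.shape[0])
    # Masked mean-squared error between feature maps.
    return float((ent * ((t_feat - s_feat) ** 2).mean(axis=0)).mean())
```

Swapping the entropy mask for, say, a max-probability or boundary mask changes which pixels the student is pushed to imitate most closely.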
Resolving ambiguity in code refinement via conidfine: A conversationally-aware framework with disambiguation and targeted retrieval
IF 6.3 | CAS Q1 | Computer Science
Neural Networks Pub Date: 2026-07-01, Epub Date: 2026-01-29, DOI: 10.1016/j.neunet.2026.108650
Aoyu Song, Afizan Azman, Shanzhi Gu, Fangjian Jiang, Jianchi Du, Tailong Wu, Mingyang Geng, Jia Li
Abstract: Code refinement is a vital aspect of software development, involving the review and enhancement of code contributions made by developers. A critical challenge in this process arises from unclear or ambiguous review comments, which can hinder developers' understanding of the required changes. Our preliminary study reveals that conversations between developers and reviewers often contain valuable information that can help resolve such ambiguous review suggestions. However, leveraging conversational data to address this issue poses two key challenges: (1) enabling the model to autonomously determine whether a review suggestion is ambiguous, and (2) effectively extracting the relevant segments from the conversation that can aid in resolving the ambiguity.

In this paper, we propose a novel method for addressing ambiguous review suggestions by leveraging conversations between reviewers and developers. To tackle the above two challenges, we introduce an Ambiguous Discriminator that uses multi-task learning to classify ambiguity and generate type-aware confusion points from a GPT-4-labeled dataset. These confusion points guide a Type-Driven Multi-Strategy Retrieval Framework that applies targeted strategies based on categories such as Inaccurate Localization, Unclear Expression, and Lack of Specific Guidance to extract actionable information from the conversation context. To support this, we construct a semantic auxiliary instruction library containing spatial indicators, clarification patterns, and action-oriented verbs, enabling precise alignment between review suggestions and informative conversation segments. Our method is evaluated on two widely used code refinement datasets, CodeReview and CodeReview-New, where we demonstrate that it significantly enhances the performance of various state-of-the-art models, including TransReview, T5-Review, CodeT5, CodeReviewer, and ChatGPT. Furthermore, we explore in depth how conversational information improves the model's ability to address fine-grained situations, and we conduct human evaluations to assess the accuracy of ambiguity detection and the correctness of generated confusion points. We are the first to introduce the issue of ambiguous review suggestions in the code refinement domain and propose a solution that not only addresses these challenges but also sets the foundation for future research. Our method provides valuable insights into improving the clarity and effectiveness of review suggestions, offering a promising direction for advancing code refinement techniques.
Citations: 0
Warm-start or cold-start? A comparison of generalizability in gradient-based hyperparameter tuning
IF 6.3 | CAS Q1 | Computer Science
Neural Networks Pub Date: 2026-07-01, Epub Date: 2026-01-29, DOI: 10.1016/j.neunet.2026.108647
Yubo Zhou, Jun Shu, Chengli Tan, Haishan Ye, Quanziang Wang, Junmin Liu, Deyu Meng, Ivor Tsang, Guang Dai
Abstract: Bilevel optimization (BO) has garnered increasing attention in hyperparameter tuning. BO methods are commonly employed with two distinct strategies for the inner level: cold-start, which uses a fixed initialization, and warm-start, which uses the last inner approximation solution as the starting point for the inner solver each time. Previous studies mainly stated that warm-start exhibits better convergence properties, while we provide a detailed comparison of these two strategies from a generalization perspective. Our findings indicate that, compared to the cold-start strategy, the warm-start strategy exhibits worse generalization performance, such as more severe overfitting on the validation set. To explain this, we establish generalization bounds for the two strategies. We reveal that the warm-start strategy produces a worse generalization upper bound due to its closer interaction with the inner-level dynamics, naturally leading to poor generalization performance. Inspired by the theoretical results, we propose several approaches to enhance the generalization capability of the warm-start strategy and narrow its gap with cold-start, in particular a novel random perturbation initialization method. Experiments validate the soundness of our theoretical analysis and the effectiveness of the proposed approaches.
Citations: 0
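The two inner-level strategies being compared can be made concrete on a toy scalar bilevel problem. This sketch only illustrates the mechanics (and the convergence advantage of warm-start that prior work noted); the ridge objective, step counts, and hyperparameter schedule are invented, and it says nothing about the generalization behavior the paper analyzes.

```python
import numpy as np

def inner_gd(w0, lam, x, y, steps=5, lr=0.1):
    """A few gradient steps on the scalar ridge inner objective
    0.5*(x*w - y)^2 + 0.5*lam*w^2, starting from w0."""
    w = w0
    for _ in range(steps):
        w -= lr * (x * (x * w - y) + lam * w)
    return w

# Outer loop over a (hand-picked) hyperparameter schedule.
x, y = 1.0, 2.0
w_cold = w_warm = 0.0
for lam in [1.0, 0.5, 0.25]:
    w_cold = inner_gd(0.0, lam, x, y)     # cold-start: fixed init each round
    w_warm = inner_gd(w_warm, lam, x, y)  # warm-start: reuse last solution
```

With the same budget of inner steps, the warm-started iterate ends up closer to the final inner optimum y*x/(x**2 + lam); the paper's point is that this tighter coupling to the inner dynamics is precisely what can hurt generalization.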
SPD-Net: A semantic partitioned transformer with dynamic graph network for improved skeleton-based gait recognition
IF 6.3 | CAS Q1 | Computer Science
Neural Networks Pub Date: 2026-07-01, Epub Date: 2026-02-03, DOI: 10.1016/j.neunet.2026.108679
Priyanka D, Mala T
Abstract: Gait recognition has gained prominence as a biometric modality owing to its unobtrusive and non-invasive nature. Existing methods primarily rely on silhouette-based representations, making them sensitive to variations in clothing, occlusion, and background noise. In contrast, model-based approaches utilize skeleton sequences to capture motion dynamics through joint connectivity, thereby reducing dependence on visual appearance. However, these approaches often rely on physically connected joints, limiting their ability to model semantically meaningful joint relationships. Transformer-based models mitigate this limitation by capturing long-range dependencies, but at the expense of substantial computational overhead. To address these challenges, this work proposes the Semantic Partitioned transformer with Dynamic Graph Network (SPD-Net) for robust gait recognition. SPD-Net integrates a Dynamic Graph Convolutional Network (DGCN), a Temporal Convolutional Network (TCN), and Semantic Partitioned Multi-head Self-Attention (SP-MSA) to enhance the representation of gait features. DGCN dynamically learns spatial correlations between joints, while TCN captures temporal dependencies. Furthermore, SP-MSA introduces a semantic partitioning strategy that selectively focuses on key joints and frames, significantly reducing computational complexity while preserving crucial gait patterns. This approach effectively models both physically neighboring and distant joint relationships, along with intra- and inter-frame correlations. Finally, a Joint-Part Mapping (JPM) module enhances the discriminative power of gait representations by capturing hierarchical joint relationships across multiple scales. Experimental evaluations on benchmark gait datasets show that SPD-Net surpasses prior state-of-the-art approaches, achieving improved robustness and accuracy across diverse gait recognition challenges.
Citations: 0