Neural Networks: Latest Articles

Structural consistency learning for unsupervised domain adaptive object detection
IF 6, Zone 1, Computer Science
Neural Networks Pub Date : 2025-07-09 DOI: 10.1016/j.neunet.2025.107767
Zhiyu Jiang, Jie Chen, Yuan Yuan
Abstract: Unsupervised domain adaptive object detection aims to facilitate the transfer of trained object detection models from the source domain to an unlabeled target domain. Although existing methods have made strides in feature alignment through adversarial learning, they tend to ignore the issue of category imbalance, leading to inadequate generalization of the model for rare categories. In addition, they fail to adequately address the background information embedded in the features, limiting the extraction of crucial object features. In order to overcome these limitations, this work proposes a structural consistency learning framework for unsupervised domain adaptive object detection. The framework enhances foreground feature representation through an Enhanced Dual Attentional Feature Alignment (EFA) mechanism and accomplishes comprehensive cross-domain feature alignment through the Structural Feature Consistency Module (SFC). The EFA introduces an attention mechanism in the image-level and instance-level feature alignment phases, enhancing the recognition of foreground objects. The SFC integrates information from multiple batches to obtain global prototypes and constructs a structure matrix based on the distances between these global prototypes. This process comprehensively reduces the structural differences between the source and target domains. The effectiveness of the approach has been validated through comprehensive experimentation on multiple cross-domain object detection benchmark datasets. The method achieves significant performance gains over current state-of-the-art techniques.
Volume 191, Article 107767
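As a rough illustration of the structure-matrix idea described in the abstract above, the sketch below computes one prototype per class, builds a matrix of pairwise prototype distances for each domain, and measures how far apart the two matrices are. It is not the authors' implementation: the function names, the use of plain class means in place of multi-batch global prototypes, the Euclidean distance, and the mean-squared discrepancy are all assumptions made for illustration. In an unlabeled target domain the labels passed in would have to be predicted ones.

```python
import numpy as np

def class_prototypes(features, labels, num_classes):
    """Mean feature vector per class (hypothetical stand-in for global prototypes)."""
    protos = np.zeros((num_classes, features.shape[1]))
    for c in range(num_classes):
        mask = labels == c
        if mask.any():  # classes with no samples keep a zero prototype
            protos[c] = features[mask].mean(axis=0)
    return protos

def structure_matrix(protos):
    """Pairwise Euclidean distances between prototypes, shape (C, C)."""
    diff = protos[:, None, :] - protos[None, :, :]
    return np.linalg.norm(diff, axis=-1)

def structure_discrepancy(src_protos, tgt_protos):
    """Mean squared difference between the two structure matrices (assumed loss form)."""
    gap = structure_matrix(src_protos) - structure_matrix(tgt_protos)
    return float(np.mean(gap ** 2))
```

Because the discrepancy depends only on distances between prototypes, it is unchanged when one domain's features are shifted by a constant offset, which is one reason a structure-level comparison can complement direct feature alignment.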
Citations: 0
KGFedRS: Knowledge Graph enhanced Federated Recommender System
IF 6, Zone 1, Computer Science
Neural Networks Pub Date : 2025-07-08 DOI: 10.1016/j.neunet.2025.107840
Xiao Ma , Xuan Wen , Jiangfeng Zeng , Ping Xiong , Xingyu Han
Abstract: Federated recommender systems are designed to learn user preferences without violating user privacy: sensitive user-item interactions are kept on the local client, and models are trained collaboratively without centrally aggregating raw data. However, the on-device local training strategy results in a serious data sparsity problem, which reduces recommendation accuracy. In this paper, we propose KGFedRS, a novel knowledge graph (KG) enhanced Federated Recommender System which not only protects the privacy of both users and the knowledge graph, but also alleviates the data sparsity problem by incorporating auxiliary information from KGs. First, we propose a privacy-preserving framework for KG-based federated recommendation by introducing a third-party server to orchestrate the encrypted data matching between the KG and user profiles without privacy leakage. Second, a novel KG-guided implicit interaction subgraph generation module is presented, aimed at learning the implicit collaborative signals for each client. Meanwhile, a local subgraph expansion module is introduced to capture the explicit high-order collaborative information. Extensive comparative experiments on three public datasets demonstrate that the proposed KGFedRS outperforms state-of-the-art federated recommendation methods in terms of effectiveness and efficiency. The datasets and source code are available at https://github.com/IHTWDhhh/KG4FedRS.
Volume 191, Article 107840
Citations: 0
Disentangled representation learning for multi-view clustering via von Mises–Fisher hyperspherical embedding
IF 6, Zone 1, Computer Science
Neural Networks Pub Date : 2025-07-08 DOI: 10.1016/j.neunet.2025.107802
Zhixiang Li , Zhiwen Luo , Nizar Bouguila , Weifeng Su , Wentao Fan
Abstract: Multi-view clustering has gained significant attention due to its ability to integrate data from diverse perspectives, frequently outperforming single-view approaches. However, existing methods often assume a Gaussian distribution within the latent embedding space, which can degrade performance when handling high-dimensional data or data with complex, non-Gaussian distributions. These limitations complicate effective data alignment, hinder meaningful information fusion across views, and impair accurate similarity measurement. To overcome these challenges, we propose a novel contrastive multi-view clustering framework that leverages hyperspherical embeddings by explicitly modeling the latent space using the von Mises–Fisher (vMF) distribution. Additionally, the framework incorporates a contrastive learning paradigm guided by alignment and uniformity losses, facilitating more discriminative and disentangled representations within the hyperspherical latent space. Specifically, the alignment loss enhances consistency across embeddings of different views from the same instance, while the uniformity loss ensures distinctiveness among embeddings from different samples within each cluster. By jointly optimizing these objectives, our method substantially improves intra-cluster cohesion and inter-cluster separability across multiple views. Extensive experiments conducted on several benchmark datasets confirm that the proposed approach significantly outperforms state-of-the-art methods, particularly in scenarios involving high-dimensional and complex datasets. The source code of our model is publicly accessible at https://github.com/jcdh/DRMVC.
Volume 191, Article 107802
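Alignment and uniformity objectives on the unit hypersphere are often written in a simple generic form: alignment as the mean distance between paired embeddings, and uniformity as the log of the mean Gaussian potential over all distinct pairs. The sketch below uses that generic form only and should not be read as the paper's exact losses; the function names and the defaults `alpha=2.0` and `t=2.0` are assumptions.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Project each row onto the unit hypersphere."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)

def alignment_loss(z_a, z_b, alpha=2.0):
    """Mean of ||z_a[i] - z_b[i]||^alpha over paired rows (two views of one instance)."""
    return float(np.mean(np.linalg.norm(z_a - z_b, axis=1) ** alpha))

def uniformity_loss(z, t=2.0):
    """Log of the mean Gaussian potential exp(-t * ||z_i - z_j||^2) over pairs i < j."""
    sq_dist = np.sum((z[:, None, :] - z[None, :, :]) ** 2, axis=-1)
    upper = np.triu_indices(z.shape[0], k=1)  # each unordered pair once
    return float(np.log(np.mean(np.exp(-t * sq_dist[upper]))))
```

Lower alignment means the two views of an instance sit closer together; lower (more negative) uniformity means the embeddings are spread more evenly over the sphere.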
Citations: 0
Uncertainty-guided attention learning for malaria parasite detection in thick blood smears
IF 6, Zone 1, Computer Science
Neural Networks Pub Date : 2025-07-08 DOI: 10.1016/j.neunet.2025.107833
Hao Xiong , Zhiyong Wang , Roneel V. Sharan , Shlomo Berkovsky
Abstract: Malaria may seriously threaten an individual’s health and wellbeing, and early screening is pivotal for timely treatment and recovery. In malaria screening, thick blood smears are used to count the parasites and assess the severity of the disease. Parasites are tiny objects in high-resolution blood smear images, which makes them difficult to detect. Besides object detection based methods, prior works have also applied image classification techniques to this problem. They first extracted image patches from blood smears as parasite candidates and then utilized convolutional neural networks to classify these patches as parasites or non-parasites. However, these approaches overlook the fact that blood smear images may contain noise, errors, and background artifacts, which introduce uncertainty and make the model predictions less stable. In this work, we propose an uncertainty-guided attention learning based network for malaria parasite detection from thick blood smears, which incorporates a pixel attention mechanism to identify more fine-grained, pixel-wise informative features and thereby improve the classification capability of our model. We further apply uncertainty estimation to the channels of the feature map to guide pixel attention learning, such that features from channels with higher uncertainty are considered unreliable and are thus restrictively exploited by pixel attention learning. To estimate channel-wise uncertainty, we introduce Bayesian channel attention, which reformulates traditional channel attention under the Bayesian framework. As a result, it denotes channel uncertainties with estimated variances that guide the pixel attention learning. We compare against several state-of-the-art baselines on two public datasets using parasite-level and patient-level evaluations. The proposed method demonstrates superior performance with respect to most metrics on the two datasets, in particular achieving the highest average precision (AP) scores in both parasite-level and patient-level scenarios.
Volume 191, Article 107833
Citations: 0
Periodic-noise-tolerant neurodynamic approach for kWTA operation applied to opinions evolution
IF 6, Zone 1, Computer Science
Neural Networks Pub Date : 2025-07-08 DOI: 10.1016/j.neunet.2025.107839
Jiexing Li , Yongji Guan , Tiantai Deng , Long Jin
Abstract: For the k-winners-take-all (kWTA) operation, several anti-noise neurodynamic approaches have been investigated to counteract various types of disturbances and uncertainties. However, these approaches still fail to effectively address periodic noise originating from external environmental interference, sensor inaccuracies, or internal system oscillations. To address this issue, a periodic-noise-tolerant neurodynamic (PNTND) approach for kWTA operation is proposed, which exhibits the capability to learn and compensate for errors induced by periodic noise. Additionally, the PNTND approach effectively eliminates interference caused by the aperiodic noise originating from the superposition of periodic noises. Theoretical analyses and numerical simulations reveal the excellent convergence performance of the PNTND approach. Moreover, we construct a social opinion evolution model that incorporates periodic noise interference based on the proposed PNTND approach, thereby demonstrating its practical applicability.
Volume 191, Article 107839
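For reference, the kWTA operation itself selects the k largest of n inputs and marks them as winners. The sketch below computes that selection directly by sorting. It only illustrates the target output that a kWTA solver is meant to reach; it does not model the PNTND dynamics or any noise. The function name `kwta` and the tie-breaking rule (lower index wins) are assumptions.

```python
import numpy as np

def kwta(u, k):
    """Return a 0/1 vector with ones at the positions of the k largest inputs."""
    u = np.asarray(u, dtype=float)
    if not 0 <= k <= u.size:
        raise ValueError("k must be between 0 and the number of inputs")
    out = np.zeros(u.size, dtype=int)
    if k > 0:
        # Stable sort on the negated inputs: ties go to the lower index.
        winners = np.argsort(-u, kind="stable")[:k]
        out[winners] = 1
    return out
```

For example, `kwta([0.2, 0.9, 0.5, 0.1], 2)` marks the second and third inputs as winners.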
Citations: 0
A confidence-guided Unsupervised domain adaptation network with pseudo-labeling and deformable CNN-transformer for medical image segmentation
IF 6, Zone 1, Computer Science
Neural Networks Pub Date : 2025-07-08 DOI: 10.1016/j.neunet.2025.107844
Jiwen Zhou , Yue Xu , Zinan Liu , Fabien Pfaender , Wanyu Liu
Abstract: Unsupervised domain adaptation (UDA) methods have achieved significant progress in medical image segmentation. Nevertheless, the significant differences between the source and target domains remain a daunting barrier, creating an urgent need for more robust cross-domain solutions. Current UDA techniques generally employ a fixed, unvarying feature alignment procedure to reduce inter-domain differences throughout the training process. This rigidity disregards the shifting nature of feature distributions throughout the training process, leading to suboptimal performance in boundary delineation and detail retention on the target domain. A novel confidence-guided unsupervised domain adaptation network (CUDA-Net) is introduced to overcome persistent domain gaps, adapt to shifting feature distributions during training, and enhance boundary delineation in the target domain. This proposed network adaptively aligns features by tracking cross-domain distribution shifts throughout training, starting with adversarial alignment at early stages (coarse) and transitioning to pseudo-label-driven alignment at later stages (fine-grained), thereby leading to more accurate segmentation in the target domain. A confidence-weighted mechanism then refines these pseudo labels by prioritizing high-confidence regions while allowing low-confidence areas to be gradually explored, thereby enhancing both label reliability and overall model stability. Experiments on three representative medical image datasets, namely MMWHS17, BraTS2021, and VS-Seg, confirm the superiority of CUDA-Net. Notably, CUDA-Net outperforms eight leading methods in terms of overall segmentation accuracy (Dice) and boundary extraction precision (ASD), highlighting that it offers an efficient and reliable solution for cross-domain medical image segmentation.
Volume 191, Article 107844
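One simple way to picture confidence weighting of pseudo labels is a cross-entropy in which each pixel's contribution is scaled by how confident the labeling model was. The sketch below is a generic illustration under that assumption, not CUDA-Net's mechanism: the function name, the teacher/student split, and the choice of the maximum softmax probability as the weight are all hypothetical.

```python
import numpy as np

def confidence_weighted_ce(teacher_probs, student_probs, eps=1e-12):
    """Cross-entropy against pseudo labels, weighted per pixel by teacher confidence.

    Both inputs have shape (num_pixels, num_classes) and hold softmax outputs.
    """
    pseudo = teacher_probs.argmax(axis=1)   # hard pseudo label per pixel
    conf = teacher_probs.max(axis=1)        # confidence used as a soft weight
    picked = student_probs[np.arange(len(pseudo)), pseudo]
    nll = -np.log(np.clip(picked, eps, 1.0))
    return float(np.sum(conf * nll) / np.sum(conf))
```

Using a soft weight rather than a hard threshold means low-confidence pixels still contribute a little, which is one way to let uncertain regions be explored gradually instead of being discarded outright.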
Citations: 0
Tensor wheel completion for visual data with sparsity and smoothness on latent space
IF 6, Zone 1, Computer Science
Neural Networks Pub Date : 2025-07-08 DOI: 10.1016/j.neunet.2025.107713
Jinshi Yu, Yuan Xie, Ge Ma, Xiaomeng Li
Abstract: Tensor wheel decomposition has recently drawn much attention in tensor completion, owing to the advantages of its wheel topology in exploring intrinsic relationships. However, since the rank of a tensor wheel is defined as a vector, it is very hard to select a good rank for tensor completion when the model is rank-sensitive, i.e., when the model is prone to overfitting due to rank selection. To solve this problem, under the tensor wheel structure, we theoretically analyze how sparsity and smoothness relate to overfitting, with the aim of improving performance by preventing the overfitting caused by excessive rank selection. Then, based on this analysis, we propose a novel tensor wheel completion model with sparsity and smoothness on the latent space. Lastly, an efficient alternating direction method of multipliers (ADMM)-based algorithm is developed to optimize the proposed model. Experimental results show that the proposed method is superior to some existing methods in tensor completion and maintains good results over a large range of rank selections, which makes it less prone to overfitting as the rank increases.
Volume 192, Article 107713
Citations: 0
Multimodal prediction of catheter ablation outcomes in patients with persistent atrial fibrillation
IF 6, Zone 1, Computer Science
Neural Networks Pub Date : 2025-07-08 DOI: 10.1016/j.neunet.2025.107835
Zhan Zhou , Yanjie Chen , Fengxiang Zhang , Hanjie Liu , Hamid Reza Karimi , Jinde Cao
Abstract: The recurrence of atrial fibrillation (AF) following catheter ablation is a common complication in patients with persistent atrial fibrillation (psAF), increasing the risk of stroke and heart failure thereafter. Given the multifactorial nature of post-ablation AF, clinical predictions of successful ablation often suffer from poor accuracy and lack robustness. This paper proposes a multimodal prediction model for post-ablation AF, which extracts complex features from multidimensional data, including electrocardiogram (ECG) images, cellular characteristics, intraoperative and demographic information of patients. Specifically, a dual-module structure is proposed for ECG processing. It consists of an image module that extracts spatial features and a temporal module that captures sequential features, effectively capturing the spatiotemporal dynamics of ECG. A clinical-intraoperative data integration module is developed to combat the complex nature of cellular, demographic, and intraoperative data structures in clinical settings by leveraging sparse and dense feature integration, enabling effective representation and processing. Finally, a feature fusion module is introduced, composed of a dynamic weight mechanism and a multimodal Transformer model, enhancing feature interaction and facilitating effective information synchrony between different modalities. The experimental results demonstrate that the proposed model achieved an accuracy of 0.9079 and an Area Under the Curve (AUC) of 0.8690. These findings highlight significant effectiveness in post-ablation AF prediction, offering a comprehensive prediction framework that supports early intervention for patients with psAF at risk for AF recurrence.
Volume 191, Article 107835
Citations: 0
Adversarial guided diffusion models for adversarial purification
IF 6, Zone 1, Computer Science
Neural Networks Pub Date : 2025-07-08 DOI: 10.1016/j.neunet.2025.107705
Guang Lin , Zerui Tao , Jianhai Zhang , Toshihisa Tanaka , Qibin Zhao
Abstract: Diffusion model (DM) based adversarial purification (AP) has proven to be a powerful defense method that can remove adversarial perturbations and generate a purified example without threats. In principle, pre-trained DMs can only ensure that purified examples conform to the distribution of the training data, but they may inadvertently compromise the semantic information of input examples, leading to misclassification of purified examples. Recent advancements introduce guided diffusion techniques to preserve semantic information while removing the perturbations. However, this guidance often relies on distance measures between purified examples and diffused examples, which can also preserve perturbations in purified examples. To further unleash the robustness of DM-based AP, we propose an adversarial guided diffusion model by introducing a novel adversarial guidance that contains sufficient semantic information but does not explicitly involve adversarial perturbations. The guidance is modeled by an auxiliary neural network obtained with adversarial training, considering the distance in the latent representations rather than at the pixel level. Extensive experiments are conducted on CIFAR-10, CIFAR-100 and ImageNet to demonstrate that our method is effective at simultaneously maintaining semantic information and removing the adversarial perturbations. In addition, comprehensive comparisons show that our method significantly enhances the robustness of existing DM-based AP, with average robust accuracy improved by up to 7.30% on CIFAR-10.
Volume 191, Article 107705
Citations: 0
ShiftKD: Benchmarking knowledge distillation under distribution shift
IF 6, Zone 1, Computer Science
Neural Networks Pub Date : 2025-07-08 DOI: 10.1016/j.neunet.2025.107838
Songming Zhang , Yuxiao Luo , Ziyu Lyu , Xiaofeng Chen
Abstract: Knowledge Distillation (KD) transfers knowledge from large models to small models and has recently achieved remarkable success. However, the reliability of existing KD methods in real-world applications, especially under distribution shift, remains underexplored. Distribution shift refers to drift in the data distribution between the training and testing phases, which can adversely affect the efficacy of KD. In this paper, we propose a unified and systematic framework, ShiftKD, to benchmark KD against two general distributional shifts: diversity shift and correlation shift. The evaluation benchmark covers more than 30 methods from algorithmic, data-driven, and optimization perspectives on five benchmark datasets. Using ShiftKD, we conduct extensive experiments that reveal the strengths and limitations of current SOTA KD methods. More importantly, we thoroughly analyze key factors in the student model training process, including data augmentation, pruning methods, optimizers, and evaluation metrics. We believe ShiftKD could serve as an effective benchmark for assessing KD in real-world scenarios, thus driving the development of more robust KD methods in response to evolving demands. The code will be made available upon publication.
Volume 192, Article 107838
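The distillation objective that such benchmarks evaluate is often the classic temperature-scaled formulation: a KL divergence between softened teacher and student distributions, multiplied by the squared temperature. The sketch below shows that standard form only. It is not taken from the ShiftKD code, and the function names and the default temperature of 4 are assumptions.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Row-wise softmax of logits / temperature, computed stably."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def kd_loss(student_logits, teacher_logits, temperature=4.0, eps=1e-12):
    """T^2 * mean KL(teacher || student) between temperature-softened distributions."""
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = np.sum(p_t * (np.log(p_t + eps) - np.log(p_s + eps)), axis=1)
    return float(temperature ** 2 * np.mean(kl))
```

The loss is zero when student and teacher logits agree and grows as their softened predictions diverge, so comparing it on in-distribution and shifted test data is one simple way to see how well the student tracks the teacher under shift.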
Citations: 0