Changjian Deng, Jian Cheng, Yanzhou Su, Zeyu An, Zhiguo Yang, Ziying Xia, Yijie Zhang, Shiguang Wang
Neural Networks, volume 194, Article 108136. Published 2025-09-22. DOI: 10.1016/j.neunet.2025.108136
Impact factor: 6.3. JCR: Q1 (Computer Science, Artificial Intelligence). Article page: https://www.sciencedirect.com/science/article/pii/S0893608025010160
WideTopo: Improving foresight neural network pruning through training dynamics preservation and wide topologies exploration
Foresight neural network pruning methods have garnered significant attention due to their potential to save computational resources. Recent advances in this field fall predominantly into two categories: saliency score-based and graph theory-based methods. The former assess the sensitivity of pruning parameter connections with respect to specific metrics, while the latter aim to identify sub-networks characterized by sparse yet highly connected graph structures. However, recent research suggests that relying exclusively on saliency scores may result in deep but narrow sub-networks, while graph theory-based methods may be unsuitable for neural networks that require pre-trained parameters for initialization, particularly in transfer learning scenarios. We hypothesize that preserving the training dynamics of sub-networks during pruning, along with exploring network structures with wide topology, can facilitate the identification of structurally stable sub-networks with improved post-training performance. Motivated by this, we propose WideTopo, which integrates Neural Tangent Kernel (NTK) theory with Implicit Target Alignment (ITA) in neural networks to capture the training dynamics of sub-networks. Furthermore, it employs a density-aware saliency score decay strategy and a repeated mask restoration strategy to retain more effective nodes, thereby sustaining the width of each layer within the sub-networks. We conducted extensive validation using CNN-based and ViT-based models on representative image classification and semantic segmentation datasets under both random and pre-trained initialization settings. The results validate the effectiveness and applicability of our method on diverse network architectures at various model density rates, showing competitive post-training performance compared with existing baselines. Our code is publicly available at https://github.com/Memoristor/WideTopo.
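To make the "deep but narrow" concern concrete, the following is a minimal sketch of generic saliency score-based pruning, not WideTopo itself. It assumes a SNIP-style score |w * g| per connection and a single global threshold; the function names (`saliency_scores`, `global_mask`, `layer_width`) and the toy numbers are hypothetical and chosen only for illustration.

```python
def saliency_scores(weights, grads):
    """Per-connection score |w * g| (an assumed, SNIP-style choice)."""
    return [[abs(w * g) for w, g in zip(w_row, g_row)]
            for w_row, g_row in zip(weights, grads)]


def global_mask(layer_scores, density):
    """Keep the top `density` fraction of connections across all layers."""
    flat = sorted((s for layer in layer_scores for row in layer for s in row),
                  reverse=True)
    keep = max(1, round(density * len(flat)))
    threshold = flat[keep - 1]
    # Ties at the threshold are all kept, so the mask can be slightly denser.
    return [[[1 if s >= threshold else 0 for s in row] for row in layer]
            for layer in layer_scores]


def layer_width(mask):
    """Count output units (rows) that keep at least one incoming connection."""
    return sum(1 for row in mask if any(row))


# Toy two-layer example: each row holds the incoming weights of one output unit.
layer_a = saliency_scores([[1.0, 2.0], [0.1, 0.2]], [[1.0, 1.0], [1.0, 1.0]])
layer_b = saliency_scores([[3.0, 0.05]], [[1.0, 1.0]])
masks = global_mask([layer_a, layer_b], density=0.5)
widths = [layer_width(m) for m in masks]
print(masks, widths)
```

In this toy case the second unit of the first layer loses every incoming connection, so that layer's effective width drops from 2 to 1 even though half of all connections survive. The abstract's density-aware score decay and mask restoration strategies are aimed at this kind of width loss; their actual formulation is not given in the abstract and is not reproduced here.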
About the journal:
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.