{"title":"利用知识转移进行神经网络结构优化,提高训练效率","authors":"M. Gavrilescu, Florin Leon","doi":"10.1109/ICSTCC55426.2022.9931898","DOIUrl":null,"url":null,"abstract":"Neural networks have demonstrated their potential for solving a wide range of real-world problems. While capable of learning correlations and relationships among complex datasets, neural network training and validation are often tedious and time-consuming, especially for large, high-dimensional datasets, given that sufficient models should be run through the validation pipeline so as to adequately cover the hyperparameter space. In order to address this problem, we propose a weight adjustment method for improving training times when searching for the optimal neural network architecture by incorporating a knowledge transfer step into the corresponding pipeline. This involves learning weight features from a trained model and using them to improve the training times of a subsequent one. Considering a sequence of candidate neural networks with different hyperparameter values, each model, once trained, serves as a source of knowledge for the next one, resulting in improved overall training times. We test our approach on dataset of various sizes, where we obtain reduced total training times, especially when small learning rates are used for finely-tuned convergence.","PeriodicalId":220845,"journal":{"name":"2022 26th International Conference on System Theory, Control and Computing (ICSTCC)","volume":"9 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Using Knowledge Transfer for Neural Network Architecture Optimization with Improved Training Efficiency\",\"authors\":\"M. Gavrilescu, Florin Leon\",\"doi\":\"10.1109/ICSTCC55426.2022.9931898\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Neural networks have demonstrated their potential for solving a wide range of real-world problems. While capable of learning correlations and relationships among complex datasets, neural network training and validation are often tedious and time-consuming, especially for large, high-dimensional datasets, given that sufficient models should be run through the validation pipeline so as to adequately cover the hyperparameter space. In order to address this problem, we propose a weight adjustment method for improving training times when searching for the optimal neural network architecture by incorporating a knowledge transfer step into the corresponding pipeline. This involves learning weight features from a trained model and using them to improve the training times of a subsequent one. Considering a sequence of candidate neural networks with different hyperparameter values, each model, once trained, serves as a source of knowledge for the next one, resulting in improved overall training times. 
We test our approach on dataset of various sizes, where we obtain reduced total training times, especially when small learning rates are used for finely-tuned convergence.\",\"PeriodicalId\":220845,\"journal\":{\"name\":\"2022 26th International Conference on System Theory, Control and Computing (ICSTCC)\",\"volume\":\"9 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-10-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 26th International Conference on System Theory, Control and Computing (ICSTCC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICSTCC55426.2022.9931898\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 26th International Conference on System Theory, Control and Computing (ICSTCC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSTCC55426.2022.9931898","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Using Knowledge Transfer for Neural Network Architecture Optimization with Improved Training Efficiency
Neural networks have demonstrated their potential for solving a wide range of real-world problems. Although they are capable of learning correlations and relationships within complex datasets, training and validating them is often tedious and time-consuming, especially for large, high-dimensional datasets, since enough candidate models must be run through the validation pipeline to adequately cover the hyperparameter space. To address this problem, we propose a weight adjustment method that reduces training times when searching for the optimal neural network architecture by incorporating a knowledge transfer step into the corresponding pipeline. This involves learning weight features from a trained model and using them to speed up the training of a subsequent one. Given a sequence of candidate neural networks with different hyperparameter values, each model, once trained, serves as a source of knowledge for the next one, improving overall training times. We test our approach on datasets of various sizes and obtain reduced total training times, especially when small learning rates are used for finely tuned convergence.
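The abstract does not specify the exact weight-feature transfer mechanism, so the following is only a minimal sketch of the general idea under stated assumptions: in a sequential hyperparameter search, each freshly built candidate is seeded with the overlapping portion of the previously trained candidate's weights before its own training begins. The PyTorch implementation, function names, layer shapes, and the overlap-copy rule are all illustrative assumptions, not the authors' method.

# Hypothetical sketch of sequential knowledge transfer during an
# architecture search: each trained candidate seeds the next one by
# copying the overlapping slice of every weight/bias tensor.
import torch
import torch.nn as nn

def make_candidate(hidden: int) -> nn.Sequential:
    # One-hidden-layer regressor; only the hidden width varies per candidate.
    return nn.Sequential(nn.Linear(16, hidden), nn.ReLU(), nn.Linear(hidden, 1))

def transfer_overlap(src: nn.Module, dst: nn.Module) -> None:
    # Copy the overlapping region of each parameter tensor from the trained
    # source model into the freshly initialized destination model.
    with torch.no_grad():
        for p_src, p_dst in zip(src.parameters(), dst.parameters()):
            region = tuple(slice(0, min(a, b)) for a, b in zip(p_src.shape, p_dst.shape))
            p_dst[region].copy_(p_src[region])

def train(model: nn.Module, x: torch.Tensor, y: torch.Tensor,
          lr: float = 1e-3, steps: int = 200) -> float:
    # Plain MSE training loop; returns the final loss value.
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
    return loss.item()

if __name__ == "__main__":
    x = torch.randn(256, 16)
    y = x.sum(dim=1, keepdim=True)
    previous = None
    for hidden in (32, 48, 64):                    # candidate hyperparameter values
        candidate = make_candidate(hidden)
        if previous is not None:
            transfer_overlap(previous, candidate)  # knowledge transfer step
        final_loss = train(candidate, x, y)
        print(f"hidden={hidden}  final loss={final_loss:.4f}")
        previous = candidate

Copying only the overlapping slice lets the transfer degrade gracefully when the next candidate is wider or narrower than its predecessor: shared coordinates start from trained values while any extra units keep their fresh initialization.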