A hybrid-model optimization algorithm based on the Gaussian process and particle swarm optimization for mixed-variable CNN hyperparameter automatic search
Han Yan, Chongquan Zhong, Yuhu Wu, Liyong Zhang, Wei Lu
Frontiers of Information Technology & Electronic Engineering, published 2023-09-07. DOI: 10.1631/fitee.2200515 (https://doi.org/10.1631/fitee.2200515)
Abstract
Convolutional neural networks (CNNs) have developed quickly in many real-world fields. However, a CNN's performance depends heavily on its hyperparameters, and finding suitable hyperparameters for CNNs in application fields is challenging for three reasons: (1) the hyperparameters are of different types, which requires a mixed-variable encoding; (2) evaluating candidate hyperparameter configurations is computationally expensive; and (3) the convergence rate and model performance must both be ensured during the hyperparameter search. To overcome these challenges, this paper proposes GPPSO, a hybrid-model optimization algorithm (OA) based on the Gaussian process (GP) and particle swarm optimization (PSO), to search for suitable hyperparameter configurations automatically. First, a new encoding method is designed to deal efficiently with the CNN hyperparameter mixed-variable problem. Second, a hybrid-surrogate-assisted (HSA) model is proposed to reduce the high cost of evaluating candidate hyperparameter configurations. Third, a novel activation function is suggested to improve model performance and ensure the convergence rate. Extensive experiments are performed on image-classification benchmark datasets to demonstrate the superior performance of GPPSO over state-of-the-art (SOTA) methods. Moreover, a case study on metal fracture (MF) diagnosis is carried out to evaluate the performance of the GPPSO algorithm in practical applications. Experimental results demonstrate the effectiveness and efficiency of GPPSO, which achieves accuracies of 95.26% and 76.36% in only 0.04 and 1.70 GPU days on the CIFAR10 and CIFAR100 datasets, respectively.
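The abstract does not spell out the encoding or the search loop, so the following is only a minimal sketch of one common way to run PSO over mixed-variable hyperparameters: particles move in the unit hypercube and are decoded into integer, categorical, and log-scaled values. It is not the authors' GPPSO. The search space `SPACE`, the functions `decode`, `toy_cost`, and `pso`, and the PSO coefficients are all hypothetical, and the GP-based surrogate model is omitted; a cheap analytic cost stands in for training a CNN.

```python
import math
import random

# Hypothetical search space (not from the paper): name -> (kind, spec).
SPACE = {
    "num_filters": ("int", (16, 128)),       # integer range
    "kernel_size": ("cat", (3, 5, 7)),       # categorical choices
    "learning_rate": ("log", (1e-4, 1e-1)),  # continuous, log scale
}


def decode(position):
    """Map a particle position in [0, 1]^d to a mixed-variable configuration."""
    config = {}
    for x, (name, (kind, spec)) in zip(position, SPACE.items()):
        x = min(max(x, 0.0), 1.0)  # clamp to the unit interval
        if kind == "int":
            low, high = spec
            config[name] = low + round(x * (high - low))
        elif kind == "cat":
            index = min(int(x * len(spec)), len(spec) - 1)
            config[name] = spec[index]
        else:  # "log"
            low, high = math.log10(spec[0]), math.log10(spec[1])
            config[name] = 10 ** (low + x * (high - low))
    return config


def toy_cost(config):
    """Cheap stand-in for validation error; a real run would train a CNN here."""
    return (
        ((config["num_filters"] - 64) / 112) ** 2
        + (0.0 if config["kernel_size"] == 5 else 0.1)
        + (math.log10(config["learning_rate"]) + 2) ** 2 / 9
    )


def pso(cost, dim, particles=12, iterations=40, seed=0):
    """Plain global-best PSO on [0, 1]^dim; coefficients are assumed values."""
    rng = random.Random(seed)
    pos = [[rng.random() for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    best_pos = [p[:] for p in pos]
    best_cost = [cost(decode(p)) for p in pos]
    g = min(range(particles), key=best_cost.__getitem__)
    global_pos, global_cost = best_pos[g][:], best_cost[g]
    for _ in range(iterations):
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (
                    0.7 * vel[i][d]
                    + 1.5 * r1 * (best_pos[i][d] - pos[i][d])
                    + 1.5 * r2 * (global_pos[d] - pos[i][d])
                )
                pos[i][d] = min(max(pos[i][d] + vel[i][d], 0.0), 1.0)
            c = cost(decode(pos[i]))
            if c < best_cost[i]:
                best_pos[i], best_cost[i] = pos[i][:], c
                if c < global_cost:
                    global_pos, global_cost = pos[i][:], c
    return decode(global_pos), global_cost


best_config, best_value = pso(toy_cost, dim=len(SPACE))
print(best_config, best_value)
```

In a surrogate-assisted variant, the call to `cost` inside the loop would be replaced for most particles by a prediction from a model fitted to previously evaluated configurations, with only the most promising candidates trained for real. How GPPSO selects those candidates is not described in the abstract.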