A hybrid-model optimization algorithm based on the Gaussian process and particle swarm optimization for mixed-variable CNN hyperparameter automatic search

Frontiers of Information Technology & Electronic Engineering, Pub Date: 2023-09-07, DOI: 10.1631/fitee.2200515

Han Yan, Chongquan Zhong, Yuhu Wu, Liyong Zhang, Wei Lu



Abstract

Convolutional neural networks (CNNs) have developed quickly in many real-world fields. However, a CNN's performance depends heavily on its hyperparameters, and finding suitable hyperparameters for CNNs used in application fields is challenging for three reasons: (1) the mixed-variable encoding problem posed by the different types of hyperparameters in CNNs; (2) the expensive computational cost of evaluating candidate hyperparameter configurations; and (3) the difficulty of ensuring both the convergence rate and model performance during the hyperparameter search. To overcome these challenges, this paper proposes a hybrid-model optimization algorithm (OA), based on the Gaussian process (GP) and the particle swarm optimization (PSO) algorithm (GPPSO), to search for suitable hyperparameter configurations automatically. First, a new encoding method is designed to deal efficiently with the CNN hyperparameter mixed-variable problem. Second, a hybrid-surrogate-assisted (HSA) model is proposed to reduce the high cost of evaluating candidate hyperparameter configurations. Third, a novel activation function is suggested to improve model performance and ensure the convergence rate. Extensive experiments are performed on image-classification benchmark datasets to demonstrate the superior performance of GPPSO over state-of-the-art (SOTA) methods. Moreover, a case study on metal fracture (MF) diagnosis is carried out to evaluate the performance of the GPPSO algorithm in practical applications. Experimental results demonstrate the effectiveness and efficiency of GPPSO, which achieves accuracies of 95.26% and 76.36% using only 0.04 and 1.70 GPU days on the CIFAR10 and CIFAR100 datasets, respectively.
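The abstract does not spell out the encoding method, so the following is only an illustrative sketch of one common way to let PSO handle mixed variables: each particle lives in the unit cube and is decoded into continuous, integer, and categorical hyperparameters. The search space `SPACE`, the functions `decode` and `pso_step`, and the coefficients `w`, `c1`, `c2` are all assumptions made for this example and are not taken from the paper. The GP-based surrogate model is not shown.

```python
import math
import random

# Hypothetical search space (not from the paper): one entry per hyperparameter.
SPACE = [
    ("learning_rate", "log", (1e-4, 1e-1)),          # continuous, log scale
    ("num_filters", "int", (16, 128)),               # integer range
    ("activation", "cat", ("relu", "tanh", "elu")),  # categorical choice
]


def decode(position):
    """Map a particle position in [0, 1]^d to a concrete configuration."""
    config = {}
    for u, (name, kind, spec) in zip(position, SPACE):
        u = min(max(u, 0.0), 1.0)  # clip to the unit interval
        if kind == "log":
            lo, hi = spec
            log_lo, log_hi = math.log(lo), math.log(hi)
            config[name] = math.exp(log_lo + u * (log_hi - log_lo))
        elif kind == "int":
            lo, hi = spec
            config[name] = lo + round(u * (hi - lo))
        else:  # categorical: split [0, 1] into equal-width bins
            index = min(int(u * len(spec)), len(spec) - 1)
            config[name] = spec[index]
    return config


def pso_step(position, velocity, personal_best, global_best,
             rng, w=0.7, c1=1.5, c2=1.5):
    """One standard PSO update, with positions clipped to [0, 1]."""
    new_position, new_velocity = [], []
    for x, v, pb, gb in zip(position, velocity, personal_best, global_best):
        v = w * v + c1 * rng.random() * (pb - x) + c2 * rng.random() * (gb - x)
        new_velocity.append(v)
        new_position.append(min(max(x + v, 0.0), 1.0))
    return new_position, new_velocity


if __name__ == "__main__":
    rng = random.Random(0)
    particle = [rng.random() for _ in SPACE]
    print(decode(particle))
```

Because the swarm moves in a continuous space and only the decoded configuration is mixed-type, the standard PSO update can be used unchanged. In a surrogate-assisted setting, decoded candidates would be scored by a cheap model first, and only the most promising ones would be trained as real CNNs.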

