Extending the hypergradient descent technique to reduce the time of optimal solution achieved in hyperparameter optimization algorithms

IF 1.6 · Region 3 (Engineering Technology) · JCR Q4 · ENGINEERING, INDUSTRIAL

International Journal of Industrial Engineering Computations · Pub Date: 2023-01-01 · DOI: 10.5267/j.ijiec.2023.4.004

F. Seifi, S. T. A. Niaki


Citations: 0

Abstract

Machine learning algorithms are applied in many fields. Hyperparameters matter because they control the behavior of training algorithms and strongly affect the performance of the resulting models, so future advances in the area depend largely on well-tuned hyperparameters. However, the high computational cost of evaluating algorithms on large datasets or complicated models is a significant limitation that makes the tuning process inefficient. In addition, the growing number of online applications of machine learning requires good answers in less time. To produce high-quality solutions quickly, the present study first proposes a novel classification of hyperparameters by type. Then, based on this classification and using the hypergradient technique, some hyperparameters of deep learning algorithms are adjusted during training, which shrinks the search space while discovering the optimal values of those hyperparameters. The method needs only the parameters from the previous two steps and the gradient from the previous step. Finally, the proposed method is combined with other hyperparameter optimization techniques, and the results are reviewed in two case studies. The experimental results show that, in the early stages, the proposed method improved the performance of the algorithms by 36.62% and 23.16% (based on the best average accuracy) on the CIFAR-10 and CIFAR-100 datasets, respectively, while the final answers it produced were equal to or better than those of the algorithms without it. The method can therefore be combined with hyperparameter optimization algorithms to improve their performance and make them better suited to online use, using only the parameters from the previous two steps and the gradient from the previous step.
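
The abstract does not give the update rule, so the following is only a minimal sketch of the basic hypergradient idea applied to a single hyperparameter, the learning rate; it is not the authors' extended method. It illustrates why the parameters of the previous two steps and one gradient suffice: because the last step was theta = theta_prev - alpha * g_prev, the derivative of the loss with respect to alpha can be recovered as g · (theta - theta_prev) / alpha. The function name, the hyper-learning-rate `beta`, and the toy quadratic loss are all assumptions made for illustration.

```python
import numpy as np


def hypergradient_sgd(grad_fn, theta0, alpha0=0.01, beta=0.01, steps=100):
    """Gradient descent whose learning rate alpha is tuned during training.

    Illustrative sketch only: grad_fn, alpha0, beta and steps are
    hypothetical names, not taken from the paper.
    """
    theta_prev = np.asarray(theta0, dtype=float)
    alpha = alpha0
    # First step is plain gradient descent: no hypergradient exists yet.
    theta = theta_prev - alpha * grad_fn(theta_prev)
    for _ in range(steps - 1):
        g = grad_fn(theta)
        # The last step was theta = theta_prev - alpha * g_prev, hence
        # d loss / d alpha = g . (theta - theta_prev) / alpha.
        hypergrad = g @ (theta - theta_prev) / alpha
        # Descend on alpha itself, then take the usual parameter step.
        alpha = alpha - beta * hypergrad
        theta_prev, theta = theta, theta - alpha * g
    return theta, alpha


# Toy problem (assumption): loss(theta) = 0.5 * theta^2, so the gradient is theta.
theta, alpha = hypergradient_sgd(lambda t: t, [1.0])
print("final theta:", theta, "final learning rate:", alpha)
```

On this toy problem the learning rate grows from its small initial value while successive gradients keep pointing the same way, which is the mechanism that lets training make faster progress in its early stages. A real deep learning setup would use stochastic mini-batch gradients and would need a safeguard to keep alpha positive.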
