Optimization of deep learning models: benchmark and analysis

Rasheed Ahmad, Izzat Alsmadi, Mohammad Al-Ramahi
Journal: Advances in Computational Intelligence, volume 3, issue 2. Published: 2023-03-30. DOI: 10.1007/s43674-023-00055-1
URL: https://link.springer.com/article/10.1007/s43674-023-00055-1
Citations: 1

Abstract

Model optimization in deep learning (DL) and neural networks (NN) is concerned with how and why a model can be successfully trained toward one or more objective functions. The evolutionary learning or training process continuously considers the dynamic parameters of the model. Many researchers propose a deep learning-based solution by randomly selecting a single classifier architecture. Such approaches generally overlook the hidden and complex nature of the model's internal workings and therefore produce biased results. Larger and deeper NN models bring many complexities and logistical challenges when they are built and deployed. High-quality performance generally depends on appropriate architectural settings, such as the number of hidden layers and the number of neurons in each layer. Selecting and testing various combinations of these settings manually is challenging and time-consuming. This paper presents an extensive empirical analysis of various deep learning algorithms trained recursively using permuted settings to establish benchmarks and find an optimal model. The paper analyzes the Stack Overflow dataset to predict the quality of posted questions. The analysis reveals that some well-known deep learning algorithms, such as the convolutional neural network (CNN), are the least effective for this problem, whereas the multilayer perceptron (MLP) provides efficient computation and the best prediction accuracy. The analysis also shows that manipulating only the number of neurons in each layer of a network does not influence model optimization. These findings suggest that future models should be built by considering a wide range of architectural settings to reach an optimal solution.
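
The idea of testing permuted architectural settings can be illustrated with a minimal sketch. It is not the authors' code: the names `enumerate_architectures`, `search`, and `toy_score` are hypothetical, and the scoring function is a placeholder standing in for "train the model and measure validation accuracy". The sketch only shows how combinations of hidden-layer count and neurons per layer might be enumerated and compared.

```python
from itertools import product
from typing import Callable, Iterable, List, Tuple

# One entry per hidden layer, giving that layer's neuron count.
Architecture = Tuple[int, ...]


def enumerate_architectures(
    depths: Iterable[int], widths: Iterable[int]
) -> List[Architecture]:
    """List every combination of hidden-layer count and per-layer width."""
    width_options = list(widths)
    archs: List[Architecture] = []
    for depth in depths:
        # product(..., repeat=depth) yields every ordered choice of `depth` widths.
        archs.extend(product(width_options, repeat=depth))
    return archs


def search(
    architectures: Iterable[Architecture],
    evaluate: Callable[[Architecture], float],
) -> Tuple[Architecture, float]:
    """Score each architecture and return the best one with its score."""
    scored = [(arch, evaluate(arch)) for arch in architectures]
    return max(scored, key=lambda pair: pair[1])


def toy_score(arch: Architecture) -> float:
    """Placeholder (assumption): stands in for training plus validation.

    It is NOT a real metric. It rewards depth and slightly penalizes
    total neuron count so that the sketch can run end to end.
    """
    return len(arch) - 0.001 * sum(arch)


candidates = enumerate_architectures(depths=[1, 2, 3], widths=[16, 32, 64])
best_arch, best_score = search(candidates, toy_score)
print(len(candidates), best_arch, best_score)
```

With three width options and depths of one to three layers, the grid holds 3 + 9 + 27 = 39 candidate architectures, which shows how quickly manual testing becomes impractical. In practice `toy_score` would be replaced by a function that builds, trains, and validates a real model for the given layer sizes.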
