Optimization of deep learning models: benchmark and analysis

Rasheed Ahmad, Izzat Alsmadi, Mohammad Al-Ramahi
{"title":"深度学习模型的优化:基准和分析","authors":"Rasheed Ahmad,&nbsp;Izzat Alsmadi,&nbsp;Mohammad Al-Ramahi","doi":"10.1007/s43674-023-00055-1","DOIUrl":null,"url":null,"abstract":"<div><p>Model optimization in deep learning (DL) and neural networks is concerned about how and why the model can be successfully trained towards one or more objective functions. The evolutionary learning or training process continuously considers the dynamic parameters of the model. Many researchers propose a deep learning-based solution by randomly selecting a single classifier model architecture. Such approaches generally overlook the hidden and complex nature of the model’s internal working, producing biased results. Larger and deeper NN models bring many complexities and logistic challenges while building and deploying them. To obtain high-quality performance results, an optimal model generally depends on the appropriate architectural settings, such as the number of hidden layers and the number of neurons at each layer. A challenging and time-consuming task is to select and test various combinations of these settings manually. This paper presents an extensive empirical analysis of various deep learning algorithms trained recursively using permutated settings to establish benchmarks and find an optimal model. The paper analyzed the Stack Overflow dataset to predict the quality of posted questions. The extensive empirical analysis revealed that some famous deep learning algorithms such as CNN are the least effective algorithm in solving this problem compared to multilayer perceptron (MLP), which provides efficient computing and the best results in terms of prediction accuracy. The analysis also shows that manipulating the number of neurons alone at each layer in a network does not influence model optimization. This paper’s findings will help to recognize the fact that future models should be built by considering a vast range of model architectural settings for an optimal solution.</p></div>","PeriodicalId":72089,"journal":{"name":"Advances in computational intelligence","volume":"3 2","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Optimization of deep learning models: benchmark and analysis\",\"authors\":\"Rasheed Ahmad,&nbsp;Izzat Alsmadi,&nbsp;Mohammad Al-Ramahi\",\"doi\":\"10.1007/s43674-023-00055-1\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Model optimization in deep learning (DL) and neural networks is concerned about how and why the model can be successfully trained towards one or more objective functions. The evolutionary learning or training process continuously considers the dynamic parameters of the model. Many researchers propose a deep learning-based solution by randomly selecting a single classifier model architecture. Such approaches generally overlook the hidden and complex nature of the model’s internal working, producing biased results. Larger and deeper NN models bring many complexities and logistic challenges while building and deploying them. To obtain high-quality performance results, an optimal model generally depends on the appropriate architectural settings, such as the number of hidden layers and the number of neurons at each layer. A challenging and time-consuming task is to select and test various combinations of these settings manually. 
This paper presents an extensive empirical analysis of various deep learning algorithms trained recursively using permutated settings to establish benchmarks and find an optimal model. The paper analyzed the Stack Overflow dataset to predict the quality of posted questions. The extensive empirical analysis revealed that some famous deep learning algorithms such as CNN are the least effective algorithm in solving this problem compared to multilayer perceptron (MLP), which provides efficient computing and the best results in terms of prediction accuracy. The analysis also shows that manipulating the number of neurons alone at each layer in a network does not influence model optimization. This paper’s findings will help to recognize the fact that future models should be built by considering a vast range of model architectural settings for an optimal solution.</p></div>\",\"PeriodicalId\":72089,\"journal\":{\"name\":\"Advances in computational intelligence\",\"volume\":\"3 2\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-03-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Advances in computational intelligence\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://link.springer.com/article/10.1007/s43674-023-00055-1\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Advances in computational intelligence","FirstCategoryId":"1085","ListUrlMain":"https://link.springer.com/article/10.1007/s43674-023-00055-1","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1

Abstract


Model optimization in deep learning (DL) and neural networks concerns how and why a model can be successfully trained towards one or more objective functions. The evolutionary learning, or training, process continuously adjusts the model's dynamic parameters. Many researchers propose deep learning-based solutions by arbitrarily selecting a single classifier architecture. Such approaches generally overlook the hidden and complex nature of the model's internal workings and produce biased results. Larger and deeper neural network models also introduce many complexities and logistical challenges during building and deployment. To obtain high-quality results, an optimal model generally depends on appropriate architectural settings, such as the number of hidden layers and the number of neurons at each layer; selecting and testing combinations of these settings manually is a challenging and time-consuming task. This paper presents an extensive empirical analysis of various deep learning algorithms trained repeatedly under permuted settings to establish benchmarks and find an optimal model. The paper analyzes the Stack Overflow dataset to predict the quality of posted questions. The analysis reveals that some well-known deep learning algorithms, such as the convolutional neural network (CNN), are the least effective at this task compared with the multilayer perceptron (MLP), which provides efficient computation and the best prediction accuracy. The analysis also shows that varying only the number of neurons at each layer of a network does not, by itself, influence model optimization. These findings suggest that future models should be built by considering a wide range of architectural settings to reach an optimal solution.
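
To make the benchmarking approach concrete, below is a minimal sketch of a permutation search over MLP architecture settings of the kind the abstract describes. It assumes scikit-learn with TF-IDF features; the toy data, label scheme, and setting ranges are placeholders for illustration, not the paper's actual configuration.

```python
# Sketch: exhaustively train MLP classifiers over permuted architecture
# settings (hidden-layer count, neurons per layer) and record the accuracy
# of each combination to build a benchmark.
import itertools

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Hypothetical stand-in data: question texts and quality labels
# (e.g. 1 = high-quality, 0 = low-quality).
texts = ["How do I sort a list in Python?", "plz help code broken"] * 100
labels = [1, 0] * 100

# Vectorize the question bodies; the 5000-feature cap is arbitrary.
X = TfidfVectorizer(max_features=5000).fit_transform(texts)
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.2, random_state=42
)

# Candidate settings to permute; the real study covered a wider range.
layer_counts = [1, 2, 3]
neuron_counts = [32, 64, 128]

results = {}
for n_layers, n_neurons in itertools.product(layer_counts, neuron_counts):
    # Use the same neuron count at every hidden layer (a simplification).
    model = MLPClassifier(
        hidden_layer_sizes=(n_neurons,) * n_layers,
        max_iter=200,
        random_state=42,
    )
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    results[(n_layers, n_neurons)] = acc
    print(f"layers={n_layers} neurons={n_neurons} accuracy={acc:.3f}")

# The best-performing architecture under this benchmark.
best = max(results, key=results.get)
print("best (layers, neurons):", best, "accuracy:", results[best])
```

In practice, each combination would be trained on the real Stack Overflow question data, and the recorded accuracies would form the benchmark from which an optimal architecture is selected.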
