{"title":"Comparison of the stochastic gradient descent based optimization techniques","authors":"Ersan Yazan, M. F. Talu","doi":"10.1109/IDAP.2017.8090299","DOIUrl":null,"url":null,"abstract":"The stochastic gradual descent method (SGD) is a popular optimization technique based on updating each θk parameter in the ∂J(θ)/∂θk direction to minimize / maximize the J(θ) cost function. This technique is frequently used in current artificial learning methods such as convolutional learning and automatic encoders. In this study, five different approaches (Momentum, Adagrad, Adadelta, Rmsprop ve Adam) based on SDA used in updating the θ parameters were investigated. By selecting specific test functions, the advantages and disadvantages of each approach are compared with each other in terms of the number of oscillations, the parameter update rate and the minimum cost reached. The comparison results are shown graphically.","PeriodicalId":111721,"journal":{"name":"2017 International Artificial Intelligence and Data Processing Symposium (IDAP)","volume":"208 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"54","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 International Artificial Intelligence and Data Processing Symposium (IDAP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IDAP.2017.8090299","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 54
Abstract
The stochastic gradient descent (SGD) method is a popular optimization technique based on updating each parameter θk in the direction ∂J(θ)/∂θk to minimize/maximize the cost function J(θ). This technique is frequently used in current machine learning methods such as convolutional neural networks and autoencoders. In this study, five different SGD-based approaches (Momentum, Adagrad, Adadelta, RMSprop and Adam) used in updating the θ parameters were investigated. By selecting specific test functions, the advantages and disadvantages of each approach are compared with each other in terms of the number of oscillations, the parameter update rate and the minimum cost reached. The comparison results are shown graphically.
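To make the update rule and the compared variants concrete, the following is a minimal illustrative sketch (not the paper's code): it minimizes an assumed ill-conditioned quadratic cost J(θ) = 0.5·θᵀAθ with plain SGD, Momentum, and Adam, so the difference in oscillation and in the minimum cost reached can be observed numerically. The test function, learning rates, and iteration count are assumptions chosen for readability; the paper uses its own selected test functions.

```python
import numpy as np

# Ill-conditioned quadratic chosen to provoke oscillation (an assumed test function).
A = np.diag([1.0, 10.0])

def grad(theta):
    """Gradient dJ/dtheta of J(theta) = 0.5 * theta^T A theta."""
    return A @ theta

def run(update, theta0, steps=200):
    """Apply one optimizer's per-step update rule repeatedly; return the trajectory."""
    theta, state, path = theta0.copy(), {}, [theta0.copy()]
    for _ in range(steps):
        theta = update(theta, grad(theta), state)
        path.append(theta.copy())
    return np.array(path)

def sgd(theta, g, state, lr=0.05):
    # Plain SGD: step against the gradient direction.
    return theta - lr * g

def momentum(theta, g, state, lr=0.05, beta=0.9):
    # Momentum: accumulate a velocity that damps oscillation along steep axes.
    v = beta * state.get("v", np.zeros_like(theta)) + g
    state["v"] = v
    return theta - lr * v

def adam(theta, g, state, lr=0.05, b1=0.9, b2=0.999, eps=1e-8):
    # Adam: bias-corrected first and second moment estimates scale each coordinate.
    m = state.get("m", np.zeros_like(theta))
    v = state.get("v", np.zeros_like(theta))
    t = state.get("t", 0) + 1
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g**2
    state.update(m=m, v=v, t=t)
    m_hat, v_hat = m / (1 - b1**t), v / (1 - b2**t)
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps)

theta0 = np.array([3.0, 3.0])
for name, upd in [("SGD", sgd), ("Momentum", momentum), ("Adam", adam)]:
    path = run(upd, theta0)
    cost = 0.5 * path[-1] @ A @ path[-1]
    print(f"{name:9s} final cost = {cost:.3e}")
```

Running the sketch prints the minimum cost each variant reaches after the same number of updates; inspecting the stored trajectories shows how Momentum and Adam reduce the oscillation that plain SGD exhibits along the steep coordinate.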