Comparison of the stochastic gradient descent based optimization techniques

Ersan Yazan, M. F. Talu
{"title":"Comparison of the stochastic gradient descent based optimization techniques","authors":"Ersan Yazan, M. F. Talu","doi":"10.1109/IDAP.2017.8090299","DOIUrl":null,"url":null,"abstract":"The stochastic gradual descent method (SGD) is a popular optimization technique based on updating each θk parameter in the ∂J(θ)/∂θk direction to minimize / maximize the J(θ) cost function. This technique is frequently used in current artificial learning methods such as convolutional learning and automatic encoders. In this study, five different approaches (Momentum, Adagrad, Adadelta, Rmsprop ve Adam) based on SDA used in updating the θ parameters were investigated. By selecting specific test functions, the advantages and disadvantages of each approach are compared with each other in terms of the number of oscillations, the parameter update rate and the minimum cost reached. The comparison results are shown graphically.","PeriodicalId":111721,"journal":{"name":"2017 International Artificial Intelligence and Data Processing Symposium (IDAP)","volume":"208 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"54","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 International Artificial Intelligence and Data Processing Symposium (IDAP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IDAP.2017.8090299","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 54

Abstract

The stochastic gradient descent method (SGD) is a popular optimization technique based on updating each parameter θk along the ∂J(θ)/∂θk direction to minimize or maximize the cost function J(θ). This technique is frequently used in current machine learning methods such as convolutional neural networks and autoencoders. In this study, five different SGD-based approaches (Momentum, Adagrad, Adadelta, RMSprop, and Adam) for updating the θ parameters were investigated. Using selected test functions, the advantages and disadvantages of each approach are compared in terms of the number of oscillations, the parameter update rate, and the minimum cost reached. The comparison results are shown graphically.
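For reference, the sketch below (not the authors' implementation) shows the standard update rules of the five optimizers compared in the paper, applied to a simple quadratic stand-in for the test functions. The test function, learning rates, decay factors, and iteration count are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch of the five SGD-variant update rules on a quadratic test function.
# All hyperparameters and the test function itself are illustrative assumptions.
import numpy as np

def cost(theta):            # J(theta) = sum(theta_k^2), a stand-in test function
    return np.sum(theta ** 2)

def grad(theta):            # dJ/dtheta_k = 2 * theta_k
    return 2.0 * theta

def run(update, theta0, steps=200):
    theta, state = theta0.copy(), {}
    for t in range(1, steps + 1):
        theta = update(theta, grad(theta), state, t)
    return cost(theta)

def momentum(theta, g, s, t, lr=0.05, beta=0.9):
    s["v"] = beta * s.get("v", np.zeros_like(theta)) + lr * g
    return theta - s["v"]

def adagrad(theta, g, s, t, lr=0.5, eps=1e-8):
    s["G"] = s.get("G", np.zeros_like(theta)) + g ** 2   # accumulated squared gradients
    return theta - lr * g / (np.sqrt(s["G"]) + eps)

def adadelta(theta, g, s, t, rho=0.95, eps=1e-6):
    Eg = rho * s.get("Eg", np.zeros_like(theta)) + (1 - rho) * g ** 2
    Ed = s.get("Ed", np.zeros_like(theta))
    step = np.sqrt(Ed + eps) / np.sqrt(Eg + eps) * g     # unit-consistent step size
    s["Eg"], s["Ed"] = Eg, rho * Ed + (1 - rho) * step ** 2
    return theta - step

def rmsprop(theta, g, s, t, lr=0.01, rho=0.9, eps=1e-8):
    s["Eg"] = rho * s.get("Eg", np.zeros_like(theta)) + (1 - rho) * g ** 2
    return theta - lr * g / (np.sqrt(s["Eg"]) + eps)

def adam(theta, g, s, t, lr=0.05, b1=0.9, b2=0.999, eps=1e-8):
    s["m"] = b1 * s.get("m", np.zeros_like(theta)) + (1 - b1) * g
    s["v"] = b2 * s.get("v", np.zeros_like(theta)) + (1 - b2) * g ** 2
    m_hat = s["m"] / (1 - b1 ** t)                       # bias-corrected moments
    v_hat = s["v"] / (1 - b2 ** t)
    return theta - lr * m_hat / (np.sqrt(v_hat) + eps)

theta0 = np.array([3.0, -2.0])
for name, upd in [("Momentum", momentum), ("Adagrad", adagrad),
                  ("Adadelta", adadelta), ("RMSprop", rmsprop), ("Adam", adam)]:
    print(f"{name:8s} final cost: {run(upd, theta0):.6f}")
```

Printing the final cost after a fixed number of steps mirrors, in a simplified way, the paper's comparison of the minimum cost reached by each method; the oscillation and update-rate comparisons would additionally require tracking the full parameter trajectory.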