A Strategic Weight Refinement Maneuver for Convolutional Neural Networks

Patrick Sharma, Adarsh Karan Sharma, Dinesh Kumar, Anuraganand Sharma
{"title":"A Strategic Weight Refinement Maneuver for Convolutional Neural Networks","authors":"Patrick Sharma, Adarsh Karan Sharma, Dinesh Kumar, Anuraganand Sharma","doi":"10.1109/IJCNN52387.2021.9533359","DOIUrl":null,"url":null,"abstract":"Stochastic Gradient Descent algorithms (SGD) remain a popular optimizer for deep learning networks and have been increasingly used in applications involving large datasets producing promising results. SGD approximates the gradient on a small subset of training examples, randomly selected in every iteration during network training. This randomness leads to the selection of an inconsistent order of training examples resulting in ambiguous values to solve the cost function. This paper applies Guided Stochastic Gradient Descent (GSGD) - a variant of SGD in deep learning neural networks. GSGD minimizes the training loss and maximizes the classification accuracy by overcoming the inconsistent order of data examples in SGDs. It temporarily bypasses the inconsistent data instances during gradient computation and weight update, leading to better convergence at the rate of $O(\\frac{1}{\\rho T-})$. Previously, GSGD has only been used in the shallow learning networks like the logistic regression. We try to incorporate GSGD in deep learning neural networks like the Convolutional Neural Networks (CNNs) and evaluate the classification accuracy in comparison with the same networks trained with SGDs. We test our approach on benchmark image datasets. Our baseline results show GSGD leads to a better convergence rate and improves classification accuracy by up to 3% of standard CNNs.","PeriodicalId":396583,"journal":{"name":"2021 International Joint Conference on Neural Networks (IJCNN)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 International Joint Conference on Neural Networks (IJCNN)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IJCNN52387.2021.9533359","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Stochastic Gradient Descent (SGD) algorithms remain a popular optimizer for deep learning networks and have been increasingly used in applications involving large datasets, producing promising results. SGD approximates the gradient on a small subset of training examples, randomly selected in every iteration during network training. This randomness leads to an inconsistent order of training examples, resulting in ambiguous values when solving the cost function. This paper applies Guided Stochastic Gradient Descent (GSGD), a variant of SGD, to deep learning neural networks. GSGD minimizes the training loss and maximizes the classification accuracy by overcoming the inconsistent order of data examples in SGD. It temporarily bypasses the inconsistent data instances during gradient computation and weight update, leading to better convergence at the rate of $O(\frac{1}{\rho T})$. Previously, GSGD had only been used in shallow learning models such as logistic regression. We incorporate GSGD into deep learning networks such as Convolutional Neural Networks (CNNs) and evaluate the classification accuracy in comparison with the same networks trained with SGD. We test our approach on benchmark image datasets. Our baseline results show that GSGD leads to a better convergence rate and improves classification accuracy by up to 3% over standard CNNs.
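To make the guiding step concrete, the sketch below illustrates the general idea of GSGD-style training: gather several candidate mini-batches per iteration, and update the weights only with batches whose loss is consistent with the current behaviour of the model, temporarily bypassing the rest. This is a minimal sketch under stated assumptions, not the paper's exact algorithm: the logistic-regression model, the synthetic data, the consistency test (batch loss not exceeding a running average), and the parameters `rho` and `batch_size` are all illustrative stand-ins.

```python
# Minimal GSGD-style sketch: guided batch selection on a logistic-regression model.
# Assumptions (not from the paper): model, synthetic data, and the consistency
# test (batch loss <= running average loss) are illustrative stand-ins.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic binary-classification data (hypothetical stand-in for an image dataset).
n, d = 2000, 20
X = rng.normal(size=(n, d))
true_w = rng.normal(size=d)
y = (X @ true_w + 0.5 * rng.normal(size=n) > 0).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def batch_loss_and_grad(w, Xb, yb):
    """Cross-entropy loss and gradient for one mini-batch."""
    p = sigmoid(Xb @ w)
    eps = 1e-12
    loss = -np.mean(yb * np.log(p + eps) + (1 - yb) * np.log(1 - p + eps))
    grad = Xb.T @ (p - yb) / len(yb)
    return loss, grad

w = np.zeros(d)
lr = 0.1
batch_size = 64
rho = 8            # number of candidate batches gathered per guided step (assumed value)
running_avg = None

for step in range(200):
    # Gather rho randomly selected candidate batches, as plain SGD would.
    idx = rng.permutation(n)[: rho * batch_size].reshape(rho, batch_size)
    losses, grads = [], []
    for b in idx:
        loss, grad = batch_loss_and_grad(w, X[b], y[b])
        losses.append(loss)
        grads.append(grad)

    avg = float(np.mean(losses))
    running_avg = avg if running_avg is None else 0.9 * running_avg + 0.1 * avg

    # Guided step: update only with batches whose loss is consistent with the
    # running average; temporarily bypass the rest (they remain in the data
    # pool and can be drawn again in later iterations).
    for loss, grad in zip(losses, grads):
        if loss <= running_avg:
            w -= lr * grad

full_loss, _ = batch_loss_and_grad(w, X, y)
print(f"final training loss: {full_loss:.4f}")
```

Plain SGD corresponds to applying every sampled batch unconditionally; the guided variant differs only in the consistency check before each weight update.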