Robust Training of Neural Networks at Arbitrary Precision and Sparsity

Chengxi Ye, Grace Chu, Yanfeng Liu, Yichi Zhang, Lukasz Lew, Andrew Howard
{"title":"Robust Training of Neural Networks at Arbitrary Precision and Sparsity","authors":"Chengxi Ye, Grace Chu, Yanfeng Liu, Yichi Zhang, Lukasz Lew, Andrew Howard","doi":"arxiv-2409.09245","DOIUrl":null,"url":null,"abstract":"The discontinuous operations inherent in quantization and sparsification\nintroduce obstacles to backpropagation. This is particularly challenging when\ntraining deep neural networks in ultra-low precision and sparse regimes. We\npropose a novel, robust, and universal solution: a denoising affine transform\nthat stabilizes training under these challenging conditions. By formulating\nquantization and sparsification as perturbations during training, we derive a\nperturbation-resilient approach based on ridge regression. Our solution employs\na piecewise constant backbone model to ensure a performance lower bound and\nfeatures an inherent noise reduction mechanism to mitigate perturbation-induced\ncorruption. This formulation allows existing models to be trained at\narbitrarily low precision and sparsity levels with off-the-shelf recipes.\nFurthermore, our method provides a novel perspective on training temporal\nbinary neural networks, contributing to ongoing efforts to narrow the gap\nbetween artificial and biological neural networks.","PeriodicalId":501162,"journal":{"name":"arXiv - MATH - Numerical Analysis","volume":"82 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - MATH - Numerical Analysis","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.09245","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

The discontinuous operations inherent in quantization and sparsification introduce obstacles to backpropagation. This is particularly challenging when training deep neural networks in ultra-low precision and sparse regimes. We propose a novel, robust, and universal solution: a denoising affine transform that stabilizes training under these challenging conditions. By formulating quantization and sparsification as perturbations during training, we derive a perturbation-resilient approach based on ridge regression. Our solution employs a piecewise constant backbone model to ensure a performance lower bound and features an inherent noise reduction mechanism to mitigate perturbation-induced corruption. This formulation allows existing models to be trained at arbitrarily low precision and sparsity levels with off-the-shelf recipes. Furthermore, our method provides a novel perspective on training temporal binary neural networks, contributing to ongoing efforts to narrow the gap between artificial and biological neural networks.
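The abstract describes the method only at a high level. As an illustration of the core idea, the following minimal NumPy sketch treats uniform quantization error as an additive perturbation and fits a per-channel affine transform by ridge regression, shrinking the perturbed activations back toward the clean signal. The function names (`quantize`, `denoising_affine`) and the penalty `lam` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def quantize(x, n_bits=2):
    # Uniform symmetric quantizer; np.round is the discontinuous step that
    # the paper frames as a perturbation of the clean signal.
    scale = (np.abs(x).max() + 1e-8) / (2 ** (n_bits - 1) - 0.5)
    return np.round(x / scale) * scale

def denoising_affine(x_clean, x_perturbed, lam=1e-2):
    # Fit y = a * x_perturbed + b per channel by ridge regression
    # (penalty lam on the slope); the fitted slope shrinks the perturbed
    # signal toward the clean one -- a simple noise-reduction mechanism.
    mu_p, mu_c = x_perturbed.mean(axis=0), x_clean.mean(axis=0)
    var_p = x_perturbed.var(axis=0)
    cov = ((x_perturbed - mu_p) * (x_clean - mu_c)).mean(axis=0)
    a = cov / (var_p + lam)   # ridge slope: below 1 when the perturbation adds variance
    b = mu_c - a * mu_p       # intercept re-centers the output
    return a * x_perturbed + b

rng = np.random.default_rng(0)
acts = rng.normal(size=(256, 8)).astype(np.float32)  # a batch of activations
q_acts = quantize(acts, n_bits=2)                     # ultra-low-precision copy
denoised = denoising_affine(acts, q_acts)

print("quantization MSE:", float(np.mean((q_acts - acts) ** 2)))
print("after denoising :", float(np.mean((denoised - acts) ** 2)))
```

In this toy setting the ridge slope plays the role of the noise-reduction mechanism mentioned in the abstract: because quantization inflates the per-channel variance, the fitted slope falls below one and the affine map reduces the mean squared error relative to the raw quantized activations.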