Learning Neural Network Architectures using Backpropagation

Suraj Srinivas, R. Venkatesh Babu
Proceedings of the British Machine Vision Conference 2016. DOI: 10.5244/C.30.104 (https://doi.org/10.5244/C.30.104). Publication date: 2015-11-17.
Citations: 27

Abstract

Deep neural networks with millions of parameters are at the heart of many state-of-the-art machine learning models today. However, recent work has shown that models with far fewer parameters can perform just as well. In this work, we introduce the problem of architecture learning, i.e., learning the architecture of a neural network along with its weights. We introduce a new trainable parameter called tri-state ReLU, which helps eliminate unnecessary neurons. We also propose a smooth regularizer that encourages the total number of neurons remaining after elimination to be small. The resulting objective is differentiable and simple to optimize. We experimentally validate our method on both small and large networks, and show that it can learn models with considerably fewer parameters without affecting prediction accuracy.
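The abstract does not give formulas, so the following is only a minimal sketch of what a trainable gate on each neuron and a smooth penalty on those gates could look like. The gate form (`w` switching the neuron, `d` switching its negative side), the two penalty terms, the weights `lam_binary` and `lam_count`, and all function names are assumptions made for illustration, not details stated in the text above.

```python
# Illustrative sketch only: the gate form and penalty terms are assumptions,
# not formulas taken from the abstract.

def tri_state_relu(x, w, d):
    """Gated activation with two gate parameters in [0, 1].

    w = 0         -> the neuron outputs zero and can be removed
    w = 1, d = 0  -> ordinary ReLU
    w = 1, d = 1  -> identity (the nonlinearity disappears)
    """
    return w * x if x >= 0 else w * d * x


def gate_penalty(gates, lam_binary=1.0, lam_count=0.1):
    """Smooth regularizer on a list of gate values in [0, 1].

    The w * (1 - w) term is zero only at w = 0 or w = 1, so it pushes
    gates toward binary values; the sum term favours fewer active gates.
    """
    binarize = sum(w * (1.0 - w) for w in gates)
    count = sum(gates)
    return lam_binary * binarize + lam_count * count


def clip01(w):
    """Keep a gate inside [0, 1] after a gradient step."""
    return min(1.0, max(0.0, w))


def active_neurons(gates, threshold=0.5):
    """Indices of neurons kept after thresholding the learned gates."""
    return [i for i, w in enumerate(gates) if w >= threshold]
```

In a sketch like this, both functions are differentiable in the gate values almost everywhere, so the gates can be trained by backpropagation together with the weights, and neurons whose gates settle at zero can be dropped after training.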