Learning Neural Network Architectures using Backpropagation

Proceedings of the British Machine Vision Conference 2016. Pub Date: 2015-11-17. DOI: 10.5244/C.30.104

Suraj Srinivas, R. Venkatesh Babu

Citations: 27

Abstract

Deep neural networks with millions of parameters are at the heart of many state-of-the-art machine learning models today. However, recent work has shown that models with a much smaller number of parameters can perform just as well. In this work, we introduce the problem of architecture learning, i.e., learning the architecture of a neural network along with its weights. We introduce a new trainable parameter called tri-state ReLU, which helps in eliminating unnecessary neurons. We also propose a smooth regularizer that encourages the total number of neurons remaining after elimination to be small. The resulting objective is differentiable and simple to optimize. We experimentally validate our method on both small and large networks, and show that it can learn models with a considerably small number of parameters without affecting prediction accuracy.
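
The abstract names the tri-state ReLU and the smooth regularizer but does not define them, so the following is only a minimal sketch under stated assumptions. It assumes the activation applies a slope `w` to negative inputs and a slope `d` to non-negative inputs, with both gates kept in [0, 1], so that `d = 0, w = 0` switches a neuron off and `d = 1, w = 0` behaves as a plain ReLU. It also assumes a penalty of the form `p * (1 - p)` to push gates toward 0 or 1, plus a sum over the `d` gates as a smooth stand-in for the neuron count. All function names and the weights `lam_bin` and `lam_count` are hypothetical and are not taken from the paper.

```python
def tri_state_relu(x, w, d):
    """Assumed form: slope w for negative inputs, slope d for non-negative inputs."""
    return w * x if x < 0 else d * x


def binarizing_penalty(p):
    """Smooth penalty p * (1 - p): zero at p = 0 and p = 1, largest at p = 0.5."""
    return p * (1.0 - p)


def architecture_regularizer(d_gates, w_gates, lam_bin=1.0, lam_count=0.1):
    """Hypothetical regularizer: binarize all gates and keep few d gates switched on.

    lam_bin and lam_count are illustrative weights, not values from the paper.
    """
    bin_term = sum(binarizing_penalty(p) for p in list(d_gates) + list(w_gates))
    count_term = sum(d_gates)  # smooth proxy for the number of surviving neurons
    return lam_bin * bin_term + lam_count * count_term


def surviving_neurons(d_gates, threshold=0.5):
    """Indices of neurons whose d gate rounds to 1; the rest would be removed."""
    return [i for i, d in enumerate(d_gates) if d >= threshold]


if __name__ == "__main__":
    d_gates = [0.9, 0.1, 0.6]
    w_gates = [0.0, 0.0, 0.2]
    print(architecture_regularizer(d_gates, w_gates))
    print(surviving_neurons(d_gates))
```

Because both penalty terms are polynomials in the gate values, an objective built this way stays differentiable, which matches the abstract's claim that the gates can be trained by backpropagation together with the weights.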
