{"title":"指数线性单元深度残差网络","authors":"Anish Shah, Eashan Kadam, Hena Shah, Sameer Shinde","doi":"10.1145/2983402.2983406","DOIUrl":null,"url":null,"abstract":"The depth of convolutional neural networks is a crucial ingredient for reduction in test errors on benchmarks like ImageNet and COCO. However, training a neural network becomes difficult with increasing depth. Problems like vanishing gradient and diminishing feature reuse are quite trivial in very deep convolutional neural networks. The notable recent contributions towards solving these problems and simplifying the training of very deep models are Residual and Highway Networks. These networks allow earlier representations (from the input or those learned in earlier layers) to flow unimpededly to later layers through skip connections. Such very deep models with hundreds or more layers have lead to a considerable decrease in test errors, on benchmarks like ImageNet and COCO. In this paper, we propose to replace the combination of ReLU and Batch Normalization with Exponential Linear Unit (ELU) in Residual Networks. Our experiments show that this not only speeds up the learning behavior in Residual Networks, but also improves the classification performance as the depth increases. Our model increases the accuracy on datasets like CIFAR-10 and CIFAR-100 by a significant margin.","PeriodicalId":283626,"journal":{"name":"Proceedings of the Third International Symposium on Computer Vision and the Internet","volume":"32 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"103","resultStr":"{\"title\":\"Deep Residual Networks with Exponential Linear Unit\",\"authors\":\"Anish Shah, Eashan Kadam, Hena Shah, Sameer Shinde\",\"doi\":\"10.1145/2983402.2983406\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The depth of convolutional neural networks is a crucial ingredient for reduction in test errors on benchmarks like ImageNet and COCO. However, training a neural network becomes difficult with increasing depth. Problems like vanishing gradient and diminishing feature reuse are quite trivial in very deep convolutional neural networks. The notable recent contributions towards solving these problems and simplifying the training of very deep models are Residual and Highway Networks. These networks allow earlier representations (from the input or those learned in earlier layers) to flow unimpededly to later layers through skip connections. Such very deep models with hundreds or more layers have lead to a considerable decrease in test errors, on benchmarks like ImageNet and COCO. In this paper, we propose to replace the combination of ReLU and Batch Normalization with Exponential Linear Unit (ELU) in Residual Networks. Our experiments show that this not only speeds up the learning behavior in Residual Networks, but also improves the classification performance as the depth increases. 
Our model increases the accuracy on datasets like CIFAR-10 and CIFAR-100 by a significant margin.\",\"PeriodicalId\":283626,\"journal\":{\"name\":\"Proceedings of the Third International Symposium on Computer Vision and the Internet\",\"volume\":\"32 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-04-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"103\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Third International Symposium on Computer Vision and the Internet\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2983402.2983406\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Third International Symposium on Computer Vision and the Internet","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2983402.2983406","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Deep Residual Networks with Exponential Linear Unit
The depth of convolutional neural networks is a crucial ingredient in reducing test error on benchmarks such as ImageNet and COCO. However, training a neural network becomes more difficult as depth increases. Problems such as vanishing gradients and diminishing feature reuse are prevalent in very deep convolutional neural networks. Notable recent contributions towards solving these problems and simplifying the training of very deep models are Residual and Highway Networks. These networks allow earlier representations (from the input or those learned in earlier layers) to flow unimpeded to later layers through skip connections. Such very deep models, with a hundred or more layers, have led to a considerable decrease in test error on benchmarks such as ImageNet and COCO. In this paper, we propose replacing the combination of ReLU and Batch Normalization in Residual Networks with the Exponential Linear Unit (ELU). Our experiments show that this not only speeds up learning in Residual Networks, but also improves classification performance as the depth increases. Our model improves accuracy on datasets such as CIFAR-10 and CIFAR-100 by a significant margin.
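To make the proposed change concrete, the sketch below shows a residual block in which the usual ReLU + Batch Normalization pairing is replaced by ELU (f(x) = x for x > 0, alpha * (exp(x) - 1) otherwise). This is a minimal illustration, not the authors' exact architecture: the framework (PyTorch), channel counts, and block layout are assumptions made for the example.

```python
# Minimal sketch of a residual block using ELU instead of ReLU + BatchNorm,
# in the spirit of the paper. PyTorch, layer sizes, and the exact block
# structure are illustrative assumptions, not the authors' implementation.
import torch
import torch.nn as nn


class ELUResidualBlock(nn.Module):
    """Residual block with ELU activations and no Batch Normalization."""

    def __init__(self, channels: int):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.act = nn.ELU(alpha=1.0)  # f(x) = x if x > 0 else alpha*(exp(x)-1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.act(self.conv1(x))
        out = self.conv2(out)
        # Skip connection lets earlier representations flow unimpeded.
        return self.act(out + x)


if __name__ == "__main__":
    block = ELUResidualBlock(channels=16)
    y = block(torch.randn(1, 16, 32, 32))
    print(y.shape)  # torch.Size([1, 16, 32, 32])
```

Because ELU produces negative outputs for negative inputs and pushes mean activations toward zero, it can take over part of the normalizing role that Batch Normalization plays in the standard ReLU-based residual block, which is the intuition behind the substitution explored in the paper.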