Batch Layer Normalization: A New Normalization Layer for CNNs and RNNs

Proceedings of the 6th International Conference on Advances in Artificial Intelligence. Pub Date: 2022-09-19. DOI: 10.1145/3571560.3571566

A. Ziaee, Erion Çano


Citations: 3

Abstract


This study introduces a new normalization layer, Batch Layer Normalization (BLN), to reduce internal covariate shift in the layers of deep neural networks. BLN combines batch normalization and layer normalization: during training it normalizes the input to a layer by adaptively weighting mini-batch normalization and feature normalization according to the inverse of the mini-batch size. At inference time it performs the same computation with a minor change, using either mini-batch statistics or population statistics. Because the choice between mini-batch and population statistics is left open, BLN can take part in the hyper-parameter optimization of a model. The key advantage of BLN is that its theoretical analysis is independent of the input data, while its statistical configuration depends heavily on the task performed, the amount of training data, and the batch size. Test results indicate the application potential of BLN and show faster convergence than batch normalization and layer normalization in both convolutional and recurrent neural networks. The code of the experiments is publicly available online.
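The weighting idea in the abstract can be illustrated with a small sketch. The abstract does not give the BLN formula, so everything specific here is an assumption made for illustration: the function name `batch_layer_norm_sketch`, the blend weight of 1/B on the feature-wise term (with 1 - 1/B on the mini-batch term), and the value of `EPS`. Learnable scale and shift parameters and the population statistics used at inference are left out. It should be read as one plausible way to combine the two normalizations, not as the authors' method.

```python
import math

EPS = 1e-5  # small constant for numerical stability (assumed value)


def _normalize(values):
    """Shift a list of numbers to zero mean and scale to unit variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [(v - mean) / math.sqrt(var + EPS) for v in values]


def batch_layer_norm_sketch(x):
    """Blend batch-wise and feature-wise normalization of a 2-D mini-batch.

    x is a list of B samples, each a list of F features. The 1/B weight on
    the feature-wise term is a hypothetical choice, not the paper's formula.
    """
    batch_size, num_features = len(x), len(x[0])
    w_layer = 1.0 / batch_size  # hypothetical weight on feature-wise statistics
    w_batch = 1.0 - w_layer     # remaining weight on mini-batch statistics

    # Batch-style term: normalize each feature column across the mini-batch.
    columns = [_normalize([row[j] for row in x]) for j in range(num_features)]
    # Layer-style term: normalize each sample across its own features.
    rows = [_normalize(row) for row in x]

    return [
        [w_batch * columns[j][i] + w_layer * rows[i][j]
         for j in range(num_features)]
        for i in range(batch_size)
    ]


if __name__ == "__main__":
    # With a single sample, all weight falls on the feature-wise term.
    print(batch_layer_norm_sketch([[1.0, 2.0, 3.0]]))
```

Under this assumed weighting, a mini-batch of size 1 reduces to feature-wise (layer) normalization, and the mini-batch term dominates as the batch grows, which matches the intuition that batch statistics become more reliable with larger batches.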

