{"title":"Evaluating Dropout Placements in Bayesian Regression Resnet","authors":"Lei Shi, C. Copot, S. Vanlanduit","doi":"10.2478/jaiscr-2022-0005","DOIUrl":null,"url":null,"abstract":"Abstract Deep Neural Networks (DNNs) have shown great success in many fields. Various network architectures have been developed for different applications. Regardless of the complexities of the networks, DNNs do not provide model uncertainty. Bayesian Neural Networks (BNNs), on the other hand, is able to make probabilistic inference. Among various types of BNNs, Dropout as a Bayesian Approximation converts a Neural Network (NN) to a BNN by adding a dropout layer after each weight layer in the NN. This technique provides a simple transformation from a NN to a BNN. However, for DNNs, adding a dropout layer to each weight layer would lead to a strong regularization due to the deep architecture. Previous researches [1, 2, 3] have shown that adding a dropout layer after each weight layer in a DNN is unnecessary. However, how to place dropout layers in a ResNet for regression tasks are less explored. In this work, we perform an empirical study on how different dropout placements would affect the performance of a Bayesian DNN. We use a regression model modified from ResNet as the DNN and place the dropout layers at different places in the regression ResNet. Our experimental results show that it is not necessary to add a dropout layer after every weight layer in the Regression ResNet to let it be able to make Bayesian Inference. Placing Dropout layers between the stacked blocks i.e. Dense+Identity+Identity blocks has the best performance in Predictive Interval Coverage Probability (PICP). Placing a dropout layer after each stacked block has the best performance in Root Mean Square Error (RMSE).","PeriodicalId":48494,"journal":{"name":"Journal of Artificial Intelligence and Soft Computing Research","volume":"12 1","pages":"61 - 73"},"PeriodicalIF":3.3000,"publicationDate":"2021-10-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Artificial Intelligence and Soft Computing Research","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.2478/jaiscr-2022-0005","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 1
Abstract
Deep Neural Networks (DNNs) have shown great success in many fields, and various network architectures have been developed for different applications. Regardless of the complexity of the network, however, DNNs do not provide model uncertainty. Bayesian Neural Networks (BNNs), on the other hand, are able to make probabilistic inferences. Among the various types of BNNs, Dropout as a Bayesian Approximation converts a Neural Network (NN) to a BNN by adding a dropout layer after each weight layer in the NN. This technique provides a simple transformation from a NN to a BNN. However, for DNNs, adding a dropout layer after each weight layer would lead to strong regularization due to the deep architecture. Previous research [1, 2, 3] has shown that adding a dropout layer after each weight layer in a DNN is unnecessary. Yet how to place dropout layers in a ResNet for regression tasks is less explored. In this work, we perform an empirical study on how different dropout placements affect the performance of a Bayesian DNN. We use a regression model modified from ResNet as the DNN and place the dropout layers at different positions in the regression ResNet. Our experimental results show that it is not necessary to add a dropout layer after every weight layer in the regression ResNet to enable Bayesian inference. Placing dropout layers between the stacked blocks (i.e., Dense+Identity+Identity blocks) gives the best performance in Prediction Interval Coverage Probability (PICP), while placing a dropout layer after each stacked block gives the best performance in Root Mean Square Error (RMSE).
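To make the two placement strategies and the MC-Dropout inference procedure concrete, below is a minimal PyTorch sketch. It is not the authors' architecture: it uses small fully-connected residual blocks rather than the paper's convolutional regression ResNet, and the names `RegressionResNet`, `ResidualBlock`, `mc_predict`, `p_drop`, and the `placement` options are illustrative assumptions. The sketch only shows how "dropout between stacks of blocks" versus "dropout after each stack" can be wired, and how keeping dropout active at prediction time yields a predictive mean and an uncertainty estimate.

```python
# Hypothetical sketch of dropout placements for a Bayesian regression ResNet.
# Not the paper's exact model; module and function names are illustrative.
import torch
import torch.nn as nn


class ResidualBlock(nn.Module):
    """A small fully-connected residual block for regression."""
    def __init__(self, width: int):
        super().__init__()
        self.fc1 = nn.Linear(width, width)
        self.fc2 = nn.Linear(width, width)
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(x + self.fc2(self.act(self.fc1(x))))


class RegressionResNet(nn.Module):
    """Stacks residual blocks; `placement` controls where dropout is inserted.

    'between_stacks'   -- one dropout layer between consecutive stacks of blocks
                          (the placement reported to give the best PICP).
    'after_each_stack' -- a dropout layer after every stack of blocks
                          (the placement reported to give the best RMSE).
    """
    def __init__(self, in_dim=8, width=64, n_stacks=3, blocks_per_stack=3,
                 p_drop=0.1, placement="between_stacks"):
        super().__init__()
        layers = [nn.Linear(in_dim, width), nn.ReLU()]
        for s in range(n_stacks):
            for _ in range(blocks_per_stack):
                layers.append(ResidualBlock(width))
            if placement == "after_each_stack":
                layers.append(nn.Dropout(p_drop))
            elif placement == "between_stacks" and s < n_stacks - 1:
                layers.append(nn.Dropout(p_drop))
        layers.append(nn.Linear(width, 1))
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)


def mc_predict(model, x, n_samples=50):
    """MC-Dropout inference: keep dropout stochastic and average forward passes."""
    model.train()  # leaves nn.Dropout active at prediction time
    with torch.no_grad():
        preds = torch.stack([model(x) for _ in range(n_samples)])
    return preds.mean(0), preds.std(0)  # predictive mean and uncertainty


if __name__ == "__main__":
    model = RegressionResNet(placement="between_stacks")
    x = torch.randn(16, 8)
    mean, std = mc_predict(model, x)
    print(mean.shape, std.shape)  # torch.Size([16, 1]) torch.Size([16, 1])
```

The `std` returned by `mc_predict` can be turned into a prediction interval (e.g., mean ± 1.96·std) whose empirical coverage is what PICP measures, while the predictive mean is what RMSE is computed against.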
About the journal:
Journal of Artificial Intelligence and Soft Computing Research (also available at Sciendo (De Gruyter)) is a dynamically developing international journal focused on the latest scientific results and methods in traditional artificial intelligence and soft computing techniques. Our goal is to bring together scientists representing both approaches and various research communities.