Layer-wise Scaled Gaussian Priors for Markov Chain Monte Carlo Sampled Deep Bayesian Neural Networks.

IF 3.0 · Q2 · COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Frontiers in Artificial Intelligence · Pub Date: 2025-04-25 · eCollection Date: 2025-01-01 · DOI: 10.3389/frai.2025.1444891
Devesh Jawla, John Kelleher
Citations: 0

Abstract


Previous work has demonstrated that initialization is very important both for fitting a neural network by gradient descent and for variational inference in Bayesian neural networks. In this work we investigate how Layer-wise Scaled Gaussian Priors perform with Markov Chain Monte Carlo trained Bayesian neural networks. Our experiments on 8 classification datasets of varying complexity indicate that Layer-wise Scaled Gaussian Priors make the sampling process more efficient than an Isotropic Gaussian Prior, an Isotropic Cauchy Prior, or an Isotropic Laplace Prior. We also show that the cold posterior effect does not arise when using either an Isotropic Gaussian or a Layer-wise Scaled Prior for small feed-forward Bayesian neural networks. Since Bayesian neural networks are becoming popular due to advantages such as uncertainty estimation and the prevention of over-fitting, this work seeks to improve the efficiency of Bayesian neural networks learned using Markov Chain Monte Carlo methods.
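As a concrete illustration of the layer-wise scaled prior, the sketch below computes the log-density of a zero-mean Gaussian prior whose standard deviation is set per layer. The abstract does not state the paper's exact scaling rule, so the sketch assumes the common initialization-style choice sigma_l = 1/sqrt(fan_in_l); both that rule and the function name are illustrative assumptions, not the paper's definition.

```python
import numpy as np

def layerwise_scaled_prior_logpdf(weights, fan_ins):
    """Log-density of a layer-wise scaled Gaussian prior.

    ASSUMPTION: sigma_l = 1 / sqrt(fan_in_l), analogous to He/Xavier-style
    initialization; the paper's actual scaling rule may differ.

    weights: list of weight arrays, one per layer
    fan_ins: list of the corresponding layer input widths
    """
    logp = 0.0
    for w, fan_in in zip(weights, fan_ins):
        sigma = 1.0 / np.sqrt(fan_in)
        # Sum of independent N(0, sigma^2) log-densities over all entries.
        logp += np.sum(-0.5 * (w / sigma) ** 2
                       - np.log(sigma) - 0.5 * np.log(2.0 * np.pi))
    return logp
```

Compared with an isotropic prior, which uses a single sigma for every layer, the per-layer scale keeps the prior predictive variance roughly constant as layer widths change, which is the intuition behind the claimed sampling efficiency gains.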

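The cold posterior claim can be probed with a tempered sampler. Below is a minimal random-walk Metropolis sketch targeting the tempered posterior p(theta|D)^(1/T); T = 1 is the exact Bayes posterior and T < 1 is "cold". Random-walk Metropolis is a stand-in here since the abstract does not specify the paper's MCMC method, and `log_post` is any user-supplied log-posterior (for example, a log-likelihood plus the layer-wise prior above).

```python
import numpy as np

def metropolis_tempered(log_post, theta0, n_steps=5000, step=0.05,
                        temperature=1.0, seed=0):
    """Random-walk Metropolis targeting p(theta | D)^(1 / temperature).

    temperature = 1.0 recovers the exact Bayesian posterior; values below
    1.0 give a "cold" posterior. The cold posterior effect is the sometimes
    observed improvement in predictions at T < 1; the paper reports it does
    not arise for small feed-forward BNNs under Isotropic Gaussian or
    Layer-wise Scaled priors.
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    lp = log_post(theta) / temperature
    samples = []
    for _ in range(n_steps):
        proposal = theta + step * rng.standard_normal(theta.shape)
        lp_prop = log_post(proposal) / temperature
        # Symmetric proposal, so the acceptance ratio reduces to the
        # tempered posterior ratio.
        if np.log(rng.uniform()) < lp_prop - lp:
            theta, lp = proposal, lp_prop
        samples.append(theta.copy())
    return np.array(samples)
```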
Source Journal
CiteScore: 6.10
Self-citation rate: 2.50%
Articles per year: 272
Review time: 13 weeks