posteriordb: Testing, Benchmarking and Developing Bayesian Inference Algorithms

arXiv - STAT - Computation | Pub Date: 2024-07-06 | DOI: arxiv-2407.04967

Måns Magnusson, Jakob Torgander, Paul-Christian Bürkner, Lu Zhang, Bob Carpenter, Aki Vehtari


Abstract

The generality and robustness of inference algorithms are critical to the success of widely used probabilistic programming languages such as Stan, PyMC, Pyro, and Turing.jl. When designing a new general-purpose inference algorithm, whether it involves Monte Carlo sampling or variational approximation, a fundamental problem arises in evaluating its accuracy and efficiency across a range of representative target models. To solve this problem, we propose posteriordb, a database of models and data sets defining target densities along with reference Monte Carlo draws. We further provide a guide to best practices in using posteriordb for model evaluation and comparison. To provide a wide range of realistic target densities, posteriordb currently comprises 120 representative models and has been instrumental in developing several general inference algorithms.
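As a rough illustration of the kind of accuracy check that reference draws make possible, the sketch below compares a candidate algorithm's draws for one parameter against reference draws. It does not use the posteriordb package or its API. The draws are synthetic stand-ins, and the function name `standardized_mean_error` and the choice of metric are assumptions made for this example, not something taken from the paper.

```python
import random
import statistics


def standardized_mean_error(candidate_draws, reference_draws):
    """Absolute error in the posterior mean, in units of the reference posterior SD."""
    ref_mean = statistics.fmean(reference_draws)
    ref_sd = statistics.stdev(reference_draws)
    return abs(statistics.fmean(candidate_draws) - ref_mean) / ref_sd


rng = random.Random(0)

# Synthetic stand-in for reference draws of one parameter: Normal(1, 2).
reference = [rng.gauss(1.0, 2.0) for _ in range(10_000)]

# Two hypothetical candidate algorithms: one accurate, one with a biased mean.
good = [rng.gauss(1.0, 2.0) for _ in range(10_000)]
biased = [rng.gauss(2.0, 2.0) for _ in range(10_000)]

good_err = standardized_mean_error(good, reference)
biased_err = standardized_mean_error(biased, reference)

print(f"accurate algorithm: {good_err:.3f}")
print(f"biased algorithm:   {biased_err:.3f}")
```

In a real benchmark the reference list would hold the reference draws for a given model and data set, and the same comparison would be repeated per parameter and per posterior. A mean-based metric is only one possible choice and says nothing about errors in the variance or the tails.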


