R*: A Robust MCMC Convergence Diagnostic with Uncertainty Using Decision Tree Classifiers

IF 4.9 Zone 2 (Mathematics) Q1 MATHEMATICS, INTERDISCIPLINARY APPLICATIONS

Bayesian Analysis Pub Date : 2020-11-19 DOI:10.1214/20-ba1252

Ben Lambert, Aki Vehtari


Citations: 14

Abstract


R*: A Robust MCMC Convergence Diagnostic with Uncertainty Using Decision Tree Classifiers

Markov chain Monte Carlo (MCMC) has transformed Bayesian model inference over the past three decades: mainly because of this, Bayesian inference is now a workhorse of applied scientists. Under general conditions, MCMC sampling converges asymptotically to the posterior distribution, but this provides no guarantees about its performance in finite time. The predominant method for monitoring convergence is to run multiple chains and monitor individual chains' characteristics and compare these to the population as a whole: if within-chain and between-chain summaries are comparable, then this is taken to indicate that the chains have converged to a common stationary distribution. Here, we introduce a new method for diagnosing convergence based on how well a machine learning classifier model can successfully discriminate the individual chains. We call this convergence measure $R^*$. In contrast to the predominant $\widehat{R}$, $R^*$ is a single statistic across all parameters that indicates lack of mixing, although individual variables' importance for this metric can also be determined. Additionally, $R^*$ is not based on any single characteristic of the sampling distribution; instead it uses all the information in the chain, including that given by the joint sampling distribution, which is currently largely overlooked by existing approaches. We recommend calculating $R^*$ using two different machine learning classifiers - gradient-boosted regression trees and random forests - which each work well in models of different dimensions. Because each of these methods outputs a classification probability, as a byproduct, we obtain uncertainty in $R^*$. The method is straightforward to implement and could be a complementary additional check on MCMC convergence for applied analyses.
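The core idea in the abstract, scoring convergence by how well a classifier can tell the chains apart, can be illustrated with a small sketch. Everything below is an assumption made for illustration rather than the authors' implementation: the statistic is taken to be the number of chains multiplied by held-out classification accuracy (so chance-level accuracy gives a value near 1), a plain k-nearest-neighbour classifier stands in for the gradient-boosted trees and random forests the abstract recommends, independent Gaussian draws stand in for real MCMC output, and the names `r_star_sketch` and `simulate_chains` are hypothetical.

```python
import heapq
import random


def r_star_sketch(chains, k=5, train_frac=0.7, seed=0):
    """Rough R*-style statistic: n_chains * held-out accuracy.

    chains: list of chains, each a list of draws (tuples of floats).
    ASSUMPTION: a k-nearest-neighbour classifier replaces the tree
    ensembles recommended in the abstract, to keep the sketch
    dependency-free.
    """
    rng = random.Random(seed)
    # Label every draw with the index of the chain that produced it.
    labelled = [(draw, c) for c, chain in enumerate(chains) for draw in chain]
    rng.shuffle(labelled)
    n_train = int(train_frac * len(labelled))
    train, test = labelled[:n_train], labelled[n_train:]

    def predict(x):
        # Majority vote among the k training draws closest to x.
        nearest = heapq.nsmallest(
            k,
            train,
            key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], x)),
        )
        votes = {}
        for _, label in nearest:
            votes[label] = votes.get(label, 0) + 1
        return max(votes, key=votes.get)

    accuracy = sum(predict(x) == label for x, label in test) / len(test)
    # Chance accuracy is 1 / n_chains, so well-mixed chains give about 1.
    return len(chains) * accuracy


def simulate_chains(offsets, n_draws=200, seed=1):
    """Toy stand-in for MCMC output: 2-D Gaussian draws, one mean per chain."""
    rng = random.Random(seed)
    return [
        [(rng.gauss(m, 1.0), rng.gauss(m, 1.0)) for _ in range(n_draws)]
        for m in offsets
    ]


mixed = simulate_chains([0.0, 0.0, 0.0, 0.0])    # all chains share one target
stuck = simulate_chains([0.0, 6.0, 12.0, 18.0])  # chains sit in different regions
print("mixed chains:", r_star_sketch(mixed))
print("stuck chains:", r_star_sketch(stuck))
```

Under these assumptions, chains drawn from a common distribution should give a value near 1, while chains occupying different regions should approach the number of chains (4 here). Because the classifier sees whole draws rather than one parameter at a time, the sketch also reflects the abstract's point that the joint sampling distribution is used.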


Source journal

Bayesian Analysis (Mathematics: Mathematics, Interdisciplinary Applications)

CiteScore

6.50

Self-citation rate

13.60%

Articles published

Review time

>12 weeks

About the journal: Bayesian Analysis is an electronic journal of the International Society for Bayesian Analysis. It seeks to publish a wide range of articles that demonstrate or discuss Bayesian methods in some theoretical or applied context. The journal welcomes submissions involving presentation of new computational and statistical methods; critical reviews and discussions of existing approaches; historical perspectives; description of important scientific or policy application areas; case studies; and methods for experimental design, data collection, data sharing, or data mining. Evaluation of submissions is based on importance of content and effectiveness of communication. Discussion papers are typically chosen by the Editor in Chief, or suggested by an Editor, among the regular submissions. In addition, the Journal encourages individual authors to submit manuscripts for consideration as discussion papers.