Reinforcement Learning for Adaptive MCMC

Congye Wang, Wilson Chen, Heishiro Kanagawa, Chris. J. Oates
{"title":"自适应 MCMC 的强化学习","authors":"Congye Wang, Wilson Chen, Heishiro Kanagawa, Chris. J. Oates","doi":"arxiv-2405.13574","DOIUrl":null,"url":null,"abstract":"An informal observation, made by several authors, is that the adaptive design\nof a Markov transition kernel has the flavour of a reinforcement learning task.\nYet, to-date it has remained unclear how to actually exploit modern\nreinforcement learning technologies for adaptive MCMC. The aim of this paper is\nto set out a general framework, called Reinforcement Learning\nMetropolis--Hastings, that is theoretically supported and empirically\nvalidated. Our principal focus is on learning fast-mixing Metropolis--Hastings\ntransition kernels, which we cast as deterministic policies and optimise via a\npolicy gradient. Control of the learning rate provably ensures conditions for\nergodicity are satisfied. The methodology is used to construct a gradient-free\nsampler that out-performs a popular gradient-free adaptive Metropolis--Hastings\nalgorithm on $\\approx 90 \\%$ of tasks in the PosteriorDB benchmark.","PeriodicalId":501215,"journal":{"name":"arXiv - STAT - Computation","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Reinforcement Learning for Adaptive MCMC\",\"authors\":\"Congye Wang, Wilson Chen, Heishiro Kanagawa, Chris. J. Oates\",\"doi\":\"arxiv-2405.13574\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"An informal observation, made by several authors, is that the adaptive design\\nof a Markov transition kernel has the flavour of a reinforcement learning task.\\nYet, to-date it has remained unclear how to actually exploit modern\\nreinforcement learning technologies for adaptive MCMC. The aim of this paper is\\nto set out a general framework, called Reinforcement Learning\\nMetropolis--Hastings, that is theoretically supported and empirically\\nvalidated. Our principal focus is on learning fast-mixing Metropolis--Hastings\\ntransition kernels, which we cast as deterministic policies and optimise via a\\npolicy gradient. Control of the learning rate provably ensures conditions for\\nergodicity are satisfied. The methodology is used to construct a gradient-free\\nsampler that out-performs a popular gradient-free adaptive Metropolis--Hastings\\nalgorithm on $\\\\approx 90 \\\\%$ of tasks in the PosteriorDB benchmark.\",\"PeriodicalId\":501215,\"journal\":{\"name\":\"arXiv - STAT - Computation\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-05-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - STAT - Computation\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2405.13574\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - STAT - Computation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2405.13574","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

An informal observation, made by several authors, is that the adaptive design of a Markov transition kernel has the flavour of a reinforcement learning task. Yet, to date it has remained unclear how to actually exploit modern reinforcement learning technologies for adaptive MCMC. The aim of this paper is to set out a general framework, called Reinforcement Learning Metropolis--Hastings, that is theoretically supported and empirically validated. Our principal focus is on learning fast-mixing Metropolis--Hastings transition kernels, which we cast as deterministic policies and optimise via a policy gradient. Control of the learning rate provably ensures conditions for ergodicity are satisfied. The methodology is used to construct a gradient-free sampler that outperforms a popular gradient-free adaptive Metropolis--Hastings algorithm on $\approx 90\%$ of tasks in the PosteriorDB benchmark.
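To make the idea concrete, here is a minimal sketch (not the authors' implementation) of reinforcement-learning-style adaptation of a Metropolis--Hastings kernel. The log standard deviation of a random-walk proposal is treated as the policy parameter, the realised squared jump distance serves as the reward (a common surrogate for mixing speed), and the parameter is updated with a REINFORCE-style score-function gradient rather than the deterministic policy gradient used in the paper. The toy Gaussian target, the reward choice, and all constants (learning-rate schedule, clipping range) are assumptions made purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy target: a correlated 2-d Gaussian (an assumption; any log-density would do).
cov = np.array([[1.0, 0.8], [0.8, 1.0]])
prec = np.linalg.inv(cov)

def log_target(x):
    return -0.5 * x @ prec @ x

d = 2
x = np.zeros(d)            # current state of the chain
lp_x = log_target(x)
theta = 0.0                # "policy" parameter: log of the proposal standard deviation

n_iters = 20_000
samples = np.empty((n_iters, d))

for t in range(n_iters):
    sigma = np.exp(theta)
    y = x + sigma * rng.standard_normal(d)        # random-walk proposal (gradient-free)
    lp_y = log_target(y)
    accept = np.log(rng.uniform()) < lp_y - lp_x  # Metropolis--Hastings accept/reject

    jump2 = float(np.sum((y - x) ** 2))           # squared jump distance of the proposal
    reward = jump2 if accept else 0.0             # realised squared jump: a mixing surrogate

    # Score of the Gaussian proposal w.r.t. theta = log(sigma):
    #   d/dtheta log N(y; x, sigma^2 I) = -d + ||y - x||^2 / sigma^2
    score = -d + jump2 / sigma**2

    # Diminishing adaptation: a decaying learning rate (the kind of control the paper
    # places on the learning rate to retain ergodicity), plus a clip on theta as a
    # simple practical safeguard.
    lr = 0.05 / (1.0 + t) ** 0.6
    theta = float(np.clip(theta + lr * reward * score, -3.0, 3.0))

    if accept:
        x, lp_x = y, lp_y
    samples[t] = x

print(f"adapted proposal std: {np.exp(theta):.3f}")
print("sample mean:", samples[n_iters // 2:].mean(axis=0))
```

The decaying learning rate is what makes the adaptation diminish over time, the kind of condition the paper controls to guarantee ergodicity; with a constant learning rate the adapted chain need not converge to the target in the limit.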