Scalable Sampling of Truncated Multivariate Normals Using Sequential Nearest-Neighbor Approximation
Jian Cao, Matthias Katzfuss
arXiv - STAT - Computation, 2024-06-25, arXiv:2406.17307
We propose a linear-complexity method for sampling from truncated
multivariate normal (TMVN) distributions with high fidelity by applying
nearest-neighbor approximations to a product-of-conditionals decomposition of
the TMVN density. To make the sequential sampling based on the decomposition
feasible, we introduce a novel method that avoids the intractable
high-dimensional TMVN distribution by sampling sequentially from
$m$-dimensional TMVN distributions, where $m$ is a tuning parameter controlling
the fidelity. This overcomes a crucial problem of existing methods, namely
acceptance rates that decrease rapidly as the dimension increases. Throughout our
experiments with up to tens of thousands of dimensions, we can produce
high-fidelity samples with $m$ in the dozens, achieving superior scalability
compared to existing state-of-the-art methods. We study a tetrachloroethylene
concentration dataset that has $3{,}971$ observed responses and $20{,}730$
undetected responses, together modeled as a partially censored Gaussian
process, where our method enables posterior inference for the censored
responses through sampling a $20{,}730$-dimensional TMVN distribution.
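
To make the idea of nearest-neighbor conditioning in a sequential sampler concrete, the following is a minimal Python sketch. It is not the authors' algorithm, which samples from $m$-dimensional TMVN distributions. It is a simplified univariate variant in which each coordinate is drawn from a truncated normal conditioned on at most $m$ nearest previously sampled coordinates, so its draws only approximate the TMVN target because truncation constraints on later coordinates are ignored. The function name, the zero-mean assumption, the Euclidean neighbor search, and the exponential covariance in the usage lines are all assumptions made for illustration.

```python
import numpy as np
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for cdf / inverse cdf


def nn_sequential_truncated_sample(locs, cov, lower, upper, m, rng):
    """Hypothetical helper: one draw from a zero-mean Gaussian with box
    constraints, sampling coordinate i from a univariate truncated normal
    conditioned on its (at most) m nearest previously sampled coordinates."""
    n = len(lower)
    x = np.empty(n)
    for i in range(n):
        if i == 0:
            mean, var = 0.0, cov[0, 0]
        else:
            # Nearest neighbors among coordinates that come earlier in the ordering.
            dist = np.linalg.norm(locs[:i] - locs[i], axis=1)
            nbrs = np.argsort(dist)[:m]
            # Gaussian conditional mean and variance given the neighbor values.
            w = np.linalg.solve(cov[np.ix_(nbrs, nbrs)], cov[nbrs, i])
            mean = float(w @ x[nbrs])
            var = float(cov[i, i] - w @ cov[nbrs, i])
        sd = np.sqrt(max(var, 1e-12))
        # Inverse-CDF draw from N(mean, sd^2) restricted to [lower[i], upper[i]].
        p_lo = _STD.cdf((lower[i] - mean) / sd)
        p_hi = _STD.cdf((upper[i] - mean) / sd)
        u = min(max(rng.uniform(p_lo, p_hi), 1e-12), 1.0 - 1e-12)
        value = mean + sd * _STD.inv_cdf(u)
        # Guard against floating-point rounding at the bounds.
        x[i] = min(max(value, lower[i]), upper[i])
    return x


# Usage on an assumed 1-D grid with an exponential covariance.
rng = np.random.default_rng(0)
locs = np.linspace(0.0, 1.0, 50)[:, None]
cov = np.exp(-np.abs(locs - locs.T) / 0.3)
sample = nn_sequential_truncated_sample(
    locs, cov, lower=np.zeros(50), upper=np.full(50, 2.0), m=10, rng=rng
)
```

Each step solves a linear system of size at most $m$, so the cost per coordinate does not grow with the dimension once the neighbor sets are known. The brute-force neighbor search used here is quadratic overall and would need to be replaced by a spatial index to obtain linear scaling.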