Scalable Random Feature Latent Variable Models

Ying Li, Zhidi Lin, Yuhao Liu, Michael Minyi Zhang, Pablo M Olmos, Petar M Djuric

IEEE Transactions on Pattern Analysis and Machine Intelligence, published online 2025-07-16. DOI: 10.1109/TPAMI.2025.3589728

Abstract
Random feature latent variable models (RFLVMs) are state-of-the-art tools for uncovering structure in high-dimensional, non-Gaussian data. However, their reliance on Monte Carlo sampling significantly limits scalability, posing challenges for large-scale applications. To overcome these limitations, we develop a scalable RFLVM framework based on variational Bayesian inference (VBI), a deterministic and optimization-based alternative to sampling methods. Applying VBI to RFLVMs is nontrivial due to two key challenges: (i) the lack of an explicit probability density function (PDF) for Dirichlet process (DP) mixing weights, and (ii) the inefficiency of existing VBI approaches when handling the high-dimensional variational parameters of RFLVMs. To address these issues, we adopt the stick-breaking construction for the DP, which provides an explicit and tractable PDF over mixing weights, and propose a novel inference algorithm, block coordinate descent variational inference (BCD-VI), which partitions variational parameters into blocks and applies tailored solvers to optimize them efficiently. The resulting scalable model, referred to as SRFLVM, supports various likelihoods; we demonstrate its effectiveness under Gaussian and logistic settings. Extensive experiments on diverse benchmark datasets show that SRFLVM achieves superior scalability, computational efficiency, and performance in latent representation learning and missing data imputation, consistently outperforming state-of-the-art latent variable models, including deep generative approaches.
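Two ingredients named in the abstract are easy to illustrate. First, the stick-breaking construction represents DP mixing weights through independent stick fractions v_k ~ Beta(1, alpha), with pi_k = v_k * prod_{j<k}(1 - v_j); truncating at a finite K yields an explicit, tractable density over the weights. A minimal NumPy sketch (the function name is ours, not the paper's):

```python
import numpy as np

def stick_breaking_weights(v):
    """Map stick fractions v_1..v_{K-1} in (0, 1) to K mixing weights.

    pi_k = v_k * prod_{j<k} (1 - v_j); the leftover stick mass goes to
    the K-th component, so the weights sum to one exactly.
    """
    v = np.asarray(v)
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v)))
    return np.append(v * remaining[:-1], remaining[-1])

rng = np.random.default_rng(0)
alpha = 2.0                        # DP concentration parameter
v = rng.beta(1.0, alpha, size=9)   # v_k ~ Beta(1, alpha)
pi = stick_breaking_weights(v)     # 10 weights of a truncated DP
assert np.isclose(pi.sum(), 1.0)
```

Second, BCD-VI is, at its core, block coordinate ascent on the variational objective: the variational parameters are partitioned into blocks, and each block is updated by its own solver while the others are held fixed. The paper's tailored per-block solvers are not reproduced here; the generic skeleton below, with a toy stand-in objective, only illustrates the control flow:

```python
def bcd_vi(blocks, objective, solvers, n_sweeps=100, tol=1e-8):
    """Block coordinate ascent: cycle over parameter blocks, updating
    each with its own solver while the remaining blocks stay fixed."""
    prev = objective(blocks)
    for _ in range(n_sweeps):
        for name, solver in solvers.items():
            blocks[name] = solver(blocks)   # block-specific update
        cur = objective(blocks)
        if abs(cur - prev) < tol:           # stop when the bound stalls
            break
        prev = cur
    return blocks

# Toy stand-in for an ELBO with two coupled scalar blocks; each solver
# is the closed-form argmax over its block with the other held fixed.
objective = lambda b: -(b["x"] - b["y"]) ** 2 - (b["x"] - 3.0) ** 2
solvers = {
    "x": lambda b: (b["y"] + 3.0) / 2.0,
    "y": lambda b: b["x"],
}
params = bcd_vi({"x": 0.0, "y": 0.0}, objective, solvers)
print(params)  # converges toward x = y = 3
```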