A sparse PAC-Bayesian approach for high-dimensional quantile prediction

arXiv - STAT - Statistics Theory | Pub Date: 2024-09-03 | DOI: arxiv-2409.01687

The Tien Mai


Citations: 0

Abstract


Quantile regression, a robust method for estimating conditional quantiles, has advanced significantly in fields such as econometrics, statistics, and machine learning. In high-dimensional settings, where the number of covariates exceeds the sample size, penalized methods such as the lasso have been developed to address sparsity challenges. Bayesian methods, initially connected to quantile regression via the asymmetric Laplace likelihood, have also evolved, though issues with posterior variance have led to new approaches, including pseudo/score likelihoods. This paper presents a novel probabilistic machine learning approach for high-dimensional quantile prediction. It uses a pseudo-Bayesian framework with a scaled Student-t prior and Langevin Monte Carlo for efficient computation. The method comes with strong theoretical guarantees: PAC-Bayes bounds establish non-asymptotic oracle inequalities, showing minimax-optimal prediction error and adaptation to unknown sparsity. Its effectiveness is validated through simulations and real-world data, where it performs competitively against established frequentist and Bayesian techniques.
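The ingredients named in the abstract (a quantile loss, a scaled Student-t prior, and Langevin Monte Carlo) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation, and the abstract does not specify the details. The prior form `(scale^2 + beta_j^2)^(-2)`, the temperature `lam`, the step size, the use of a subgradient of the check loss, and all function names are assumptions made here for illustration, and the sampler is a plain unadjusted Langevin scheme.

```python
import numpy as np


def pinball_loss(residual, tau):
    """Average check (pinball) loss: rho_tau(u) = u * (tau - 1{u < 0})."""
    return np.mean(residual * (tau - (residual < 0)))


def grad_potential(beta, X, y, tau, lam, scale):
    """Subgradient of U(beta) = lam * empirical pinball risk - log prior(beta)."""
    residual = y - X @ beta
    # Subgradient of the averaged pinball loss with respect to beta.
    grad_risk = -X.T @ (tau - (residual < 0)) / len(y)
    # Assumed prior: pi(beta) proportional to prod_j (scale^2 + beta_j^2)^(-2),
    # so -log pi(beta) has gradient 4 * beta_j / (scale^2 + beta_j^2).
    grad_prior = 4.0 * beta / (scale**2 + beta**2)
    return lam * grad_risk + grad_prior


def langevin_quantile(X, y, tau=0.5, lam=None, scale=0.1, step=1e-3,
                      n_iter=3000, burn_in=1000, seed=0):
    """Approximate the pseudo-posterior mean with unadjusted Langevin steps.

    The pseudo-posterior is taken to be proportional to
    exp(-lam * risk(beta)) * prior(beta); lam defaults to n (an assumption).
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    lam = float(n) if lam is None else lam
    beta = np.zeros(p)
    running_sum = np.zeros(p)
    for k in range(n_iter):
        grad = grad_potential(beta, X, y, tau, lam, scale)
        noise = rng.standard_normal(p)
        # Langevin update: gradient step on U plus Gaussian noise.
        beta = beta - step * grad + np.sqrt(2.0 * step) * noise
        if k >= burn_in:
            running_sum += beta
    # Average of the post-burn-in iterates as a point estimate.
    return running_sum / (n_iter - burn_in)
```

Calling `langevin_quantile(X, y, tau=0.5)` on a design matrix `X` of shape `(n, p)` and a response vector `y` returns a length-`p` coefficient estimate for the median. The step size and temperature would need tuning in practice, and because no Metropolis correction is applied, the chain only approximates the target distribution.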
