arXiv - STAT - Statistics Theory: Latest Papers

Asymptotics for Random Quadratic Transportation Costs
arXiv - STAT - Statistics Theory Pub Date : 2024-09-13 DOI: arxiv-2409.08612
Martin Huesmann, Michael Goldman, Dario Trevisan
Abstract: We establish the validity of asymptotic limits for the general transportation problem between random i.i.d. points and their common distribution, with respect to the squared Euclidean distance cost, in any dimension larger than three. Previous results were essentially limited to the two (or one) dimensional case, or to distributions whose absolutely continuous part is uniform. The proof relies upon recent advances in the stability theory of optimal transportation, combined with functional analytic techniques and some ideas from quantitative stochastic homogenization. The key tool we develop is a quantitative upper bound for the usual quadratic optimal transportation problem in terms of its boundary variant, where points can be freely transported along the boundary. The methods we use are applicable to more general random measures, including occupation measure of Brownian paths, and may open the door to further progress on challenging problems at the interface of analysis, probability, and discrete mathematics.
Citations: 0
On spiked eigenvalues of a renormalized sample covariance matrix from multi-population
arXiv - STAT - Statistics Theory Pub Date : 2024-09-13 DOI: arxiv-2409.08715
Weiming Li, Zeng Li, Junpeng Zhu
Abstract: Sample covariance matrices from multi-population typically exhibit several large spiked eigenvalues, which stem from differences between population means and are crucial for inference on the underlying data structure. This paper investigates the asymptotic properties of spiked eigenvalues of a renormalized sample covariance matrix from multi-population in the ultrahigh dimensional context where the dimension-to-sample size ratio $p/n$ goes to infinity. The first- and second-order convergence of these spikes are established based on asymptotic properties of three types of sesquilinear forms from multi-population. These findings are further applied to two scenarios, including determination of the total number of subgroups and a new criterion for evaluating clustering results in the absence of true labels. Additionally, we provide a unified framework with $p/n \to c \in (0, \infty]$ that integrates the asymptotic results in both high and ultrahigh dimensional settings.
Citations: 0
Improved Finite-Particle Convergence Rates for Stein Variational Gradient Descent
arXiv - STAT - Statistics Theory Pub Date : 2024-09-13 DOI: arxiv-2409.08469
Krishnakumar Balasubramanian, Sayan Banerjee, Promit Ghosal
Abstract: We provide finite-particle convergence rates for the Stein Variational Gradient Descent (SVGD) algorithm in the Kernel Stein Discrepancy ($\mathsf{KSD}$) and Wasserstein-2 metrics. Our key insight is the observation that the time derivative of the relative entropy between the joint density of $N$ particle locations and the $N$-fold product target measure, starting from a regular initial distribution, splits into a dominant 'negative part' proportional to $N$ times the expected $\mathsf{KSD}^2$ and a smaller 'positive part'. This observation leads to $\mathsf{KSD}$ rates of order $1/\sqrt{N}$, providing a near optimal double exponential improvement over the recent result by \cite{shi2024finite}. Under mild assumptions on the kernel and potential, these bounds also grow linearly in the dimension $d$. By adding a bilinear component to the kernel, the above approach is used to further obtain Wasserstein-2 convergence. For the case of 'bilinear + Matérn' kernels, we derive Wasserstein-2 rates that exhibit a curse-of-dimensionality similar to the i.i.d. setting. We also obtain marginal convergence and long-time propagation of chaos results for the time-averaged particle laws.
Citations: 0
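For readers unfamiliar with SVGD, the particle update that the abstract above analyzes can be sketched in one dimension. This is a minimal illustration and not the authors' setup: it uses the standard discrete-time SVGD update with a Gaussian (RBF) kernel, whereas the abstract refers to a time derivative and to bilinear and Matérn kernels. The function names, the step size, the bandwidth, and the standard normal target are all assumptions chosen for the example.

```python
import math
from typing import Callable, List


def svgd_step(
    particles: List[float],
    grad_log_p: Callable[[float], float],
    step_size: float,
    bandwidth: float,
) -> List[float]:
    """One discrete-time SVGD update for 1-D particles with an RBF kernel.

    Each particle x_i moves along
        phi(x_i) = (1/N) * sum_j [ k(x_j, x_i) * grad_log_p(x_j) + d/dx_j k(x_j, x_i) ],
    where k(x, y) = exp(-(x - y)^2 / (2 * bandwidth^2)).
    """
    n = len(particles)
    updated = []
    for xi in particles:
        phi = 0.0
        for xj in particles:
            k = math.exp(-((xj - xi) ** 2) / (2.0 * bandwidth ** 2))
            grad_k = -(xj - xi) / bandwidth ** 2 * k  # derivative in x_j (repulsion)
            phi += k * grad_log_p(xj) + grad_k
        updated.append(xi + step_size * phi / n)
    return updated


def run_svgd(
    particles: List[float],
    grad_log_p: Callable[[float], float],
    steps: int,
    step_size: float = 0.1,
    bandwidth: float = 1.0,
) -> List[float]:
    """Apply svgd_step repeatedly and return the final particle positions."""
    for _ in range(steps):
        particles = svgd_step(particles, grad_log_p, step_size, bandwidth)
    return particles


if __name__ == "__main__":
    # Hypothetical target: standard normal, so grad log p(x) = -x.
    start = [3.0 + 0.1 * i for i in range(20)]
    final = run_svgd(start, lambda x: -x, steps=500)
    print("mean of particles:", sum(final) / len(final))
```

The first term in `phi` pulls particles toward high-density regions of the target, and the kernel-gradient term pushes them apart; the finite-particle rates in the abstract quantify how well $N$ such particles approximate the target.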
Optimal Classification-based Anomaly Detection with Neural Networks: Theory and Practice
arXiv - STAT - Statistics Theory Pub Date : 2024-09-13 DOI: arxiv-2409.08521
Tian-Yi Zhou, Matthew Lau, Jizhou Chen, Wenke Lee, Xiaoming Huo
Abstract: Anomaly detection is an important problem in many application areas, such as network security. Many deep learning methods for unsupervised anomaly detection produce good empirical performance but lack theoretical guarantees. By casting anomaly detection into a binary classification problem, we establish non-asymptotic upper bounds and a convergence rate on the excess risk on rectified linear unit (ReLU) neural networks trained on synthetic anomalies. Our convergence rate on the excess risk matches the minimax optimal rate in the literature. Furthermore, we provide lower and upper bounds on the number of synthetic anomalies that can attain this optimality. For practical implementation, we relax some conditions to improve the search for the empirical risk minimizer, which leads to competitive performance to other classification-based methods for anomaly detection. Overall, our work provides the first theoretical guarantees of unsupervised neural network-based anomaly detectors and empirical insights on how to design them well.
Citations: 0
On the maximal correlation coefficient for the bivariate Marshall Olkin distribution
arXiv - STAT - Statistics Theory Pub Date : 2024-09-13 DOI: arxiv-2409.08661
Axel Bücher, Torben Staud
Abstract: We prove a formula for the maximal correlation coefficient of the bivariate Marshall Olkin distribution that was conjectured in Lin, Lai, and Govindaraju (2016, Stat. Methodol., 29:1-9). The formula is applied to obtain a new proof for a variance inequality in extreme value statistics that links the disjoint and the sliding block maxima method.
Citations: 0
On Admissibility in Bipartite Incidence Graph Sampling
arXiv - STAT - Statistics Theory Pub Date : 2024-09-12 DOI: arxiv-2409.07970
Pedro García-Segador, Li-Chung Zhang
Abstract: In bipartite incidence graph sampling, the target study units may be formed as connected population elements, which are distinct from the units of sampling, and there may generally exist more than one way by which a given study unit can be observed via sampling units. This generalizes finite-population element or multistage sampling, where each element can only be sampled directly or via a single primary sampling unit. We study the admissibility of estimators in bipartite incidence graph sampling and identify other admissible estimators than the classic Horvitz-Thompson estimator. Our admissibility results encompass those for finite-population sampling.
Citations: 0
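As background for the abstract above, the classic Horvitz-Thompson estimator of a population total weights each sampled value by the inverse of its inclusion probability. The sketch below covers only the plain finite-population case, not the bipartite incidence graph setting of the paper; the function name, the toy population, and the simple random sampling design are assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict, Hashable, Iterable


def horvitz_thompson_total(
    sample: Iterable[Hashable],
    values: Dict[Hashable, float],
    inclusion_prob: Dict[Hashable, float],
) -> float:
    """Horvitz-Thompson estimate of the population total: sum of y_i / pi_i over the sample."""
    return sum(values[i] / inclusion_prob[i] for i in sample)


if __name__ == "__main__":
    # Hypothetical population of four units with known values.
    values = {"a": 1.0, "b": 2.0, "c": 3.0, "d": 4.0}
    # Simple random sampling without replacement, n = 2 of N = 4,
    # gives every unit inclusion probability n / N = 0.5.
    inclusion_prob = {unit: 0.5 for unit in values}

    # Averaging the estimate over all equally likely samples illustrates unbiasedness.
    samples = list(combinations(values, 2))
    estimates = [horvitz_thompson_total(s, values, inclusion_prob) for s in samples]
    print("average estimate:", sum(estimates) / len(estimates))
    print("true total:", sum(values.values()))
```

Under this design the average of the estimates over all six possible samples equals the true total, which is the unbiasedness property; admissibility, the subject of the paper, is a separate question about whether another estimator can do uniformly at least as well.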
Generalized Independence Test for Modern Data
arXiv - STAT - Statistics Theory Pub Date : 2024-09-12 DOI: arxiv-2409.07745
Mingshuo Liu, Doudou Zhou, Hao Chen
Abstract: The test of independence is a crucial component of modern data analysis. However, traditional methods often struggle with the complex dependency structures found in high-dimensional data. To overcome this challenge, we introduce a novel test statistic that captures intricate relationships using similarity and dissimilarity information derived from the data. The statistic exhibits strong power across a broad range of alternatives for high-dimensional data, as demonstrated in extensive simulation studies. Under mild conditions, we show that the new test statistic converges to the $\chi^2_4$ distribution under the permutation null distribution, ensuring straightforward type I error control. Furthermore, our research advances the moment method in proving the joint asymptotic normality of multiple double-indexed permutation statistics. We showcase the practical utility of this new test with an application to the Genotype-Tissue Expression dataset, where it effectively measures associations between human tissues.
Citations: 0
Quickest Change Detection Using Mismatched CUSUM
arXiv - STAT - Statistics Theory Pub Date : 2024-09-12 DOI: arxiv-2409.07948
Austin Cooper, Sean Meyn
Abstract: The field of quickest change detection (QCD) concerns design and analysis of algorithms to estimate in real time the time at which an important event takes place and identify properties of the post-change behavior. The goal is to devise a stopping time adapted to the observations that minimizes an $L_1$ loss. Approximately optimal solutions are well known under a variety of assumptions. In the work surveyed here we consider the CUSUM statistic, which is defined as a one-dimensional reflected random walk driven by a functional of the observations. It is known that the optimal functional is a log likelihood ratio subject to special statistical assumptions. The paper concerns model-free approaches to detection design, considering the following questions:
1. What is the performance for a given functional of the observations?
2. How do the conclusions change when there is dependency between pre- and post-change behavior?
3. How can techniques from statistics and machine learning be adapted to approximate the best functional in a given class?
Citations: 0
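The CUSUM statistic described in the abstract above, a reflected random walk driven by a functional of the observations, can be sketched as follows. This is an illustrative sketch only: the function names, the threshold rule, and the Gaussian mean-shift log likelihood ratio are assumptions for the example and are not taken from the paper.

```python
from typing import Callable, Iterable, List, Optional, Tuple


def cusum(
    observations: Iterable[float],
    functional: Callable[[float], float],
    threshold: float,
) -> Tuple[Optional[int], List[float]]:
    """Run the recursion X_n = max(0, X_{n-1} + F(Y_n)) with X_0 = 0.

    Returns the first 1-based index n with X_n >= threshold (None if the
    threshold is never reached) and the path of X up to that point.
    """
    x = 0.0
    path: List[float] = []
    for n, y in enumerate(observations, start=1):
        x = max(0.0, x + functional(y))  # reflection at zero
        path.append(x)
        if x >= threshold:
            return n, path
    return None, path


def gaussian_llr(mu0: float, mu1: float, sigma: float) -> Callable[[float], float]:
    """Log likelihood ratio log(f1/f0) for N(mu1, sigma^2) against N(mu0, sigma^2)."""
    def llr(y: float) -> float:
        return ((mu1 - mu0) * y - 0.5 * (mu1 ** 2 - mu0 ** 2)) / sigma ** 2
    return llr


if __name__ == "__main__":
    # Hypothetical noiseless data: the mean shifts from 0 to 2 after three samples.
    data = [0.0, 0.0, 0.0, 2.0, 2.0, 2.0]
    stop, path = cusum(data, gaussian_llr(0.0, 1.0, 1.0), threshold=3.0)
    print("alarm at sample:", stop, "path:", path)
```

Passing a functional built from the wrong post-change model gives a mismatched CUSUM in the sense of the title. For instance, on the toy data above `gaussian_llr(0.0, 4.0, 1.0)` evaluates to zero at every post-change sample, so the statistic never leaves zero and no alarm is raised.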
Foundation of Calculating Normalized Maximum Likelihood for Continuous Probability Models
arXiv - STAT - Statistics Theory Pub Date : 2024-09-12 DOI: arxiv-2409.08387
Atsushi Suzuki, Kota Fukuzawa, Kenji Yamanishi
Abstract: The normalized maximum likelihood (NML) code length is widely used as a model selection criterion based on the minimum description length principle, where the model with the shortest NML code length is selected. A common method to calculate the NML code length is to use the sum (for a discrete model) or integral (for a continuous model) of a function defined by the distribution of the maximum likelihood estimator. While this method has been proven to correctly calculate the NML code length of discrete models, no proof has been provided for continuous cases. Consequently, it has remained unclear whether the method can accurately calculate the NML code length of continuous models. In this paper, we solve this problem affirmatively, proving that the method is also correct for continuous cases. Remarkably, completing the proof for continuous cases is non-trivial in that it cannot be achieved by merely replacing the sums in discrete cases with integrals, as the decomposition trick applied to sums in the discrete model case proof is not applicable to integrals in the continuous model case proof. To overcome this, we introduce a novel decomposition approach based on the coarea formula from geometric measure theory, which is essential to establishing our proof for continuous cases.
Citations: 0
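The discrete-case method that the abstract above takes as its starting point can be illustrated with a Bernoulli model. The sketch compares the NML normalizer computed by brute force over all binary sequences with the same quantity computed by grouping sequences according to the value of the maximum likelihood estimator. The choice of the Bernoulli model and all function names are assumptions for illustration; the continuous case treated in the paper, which needs the coarea formula, is not shown.

```python
import math
from itertools import product
from typing import Sequence


def max_likelihood(seq: Sequence[int]) -> float:
    """Likelihood of a binary sequence at its maximum likelihood estimate theta = k / n."""
    n, k = len(seq), sum(seq)
    theta = k / n
    return theta ** k * (1.0 - theta) ** (n - k)  # Python evaluates 0.0 ** 0 as 1.0


def normalizer_brute_force(n: int) -> float:
    """Sum the maximized likelihood over all 2^n binary sequences of length n."""
    return sum(max_likelihood(seq) for seq in product((0, 1), repeat=n))


def normalizer_via_mle(n: int) -> float:
    """Same normalizer, grouping sequences by the MLE value k / n (n + 1 terms)."""
    total = 0.0
    for k in range(n + 1):
        theta = k / n
        total += math.comb(n, k) * theta ** k * (1.0 - theta) ** (n - k)
    return total


def nml_code_length(seq: Sequence[int]) -> float:
    """NML code length in nats: -log(max likelihood) + log(normalizer)."""
    return -math.log(max_likelihood(seq)) + math.log(normalizer_via_mle(len(seq)))


if __name__ == "__main__":
    for n in (2, 5, 8):
        print(n, normalizer_brute_force(n), normalizer_via_mle(n))
    print("code length of (1, 0, 1, 1):", nml_code_length((1, 0, 1, 1)))
```

Grouping by the estimator reduces the sum from 2^n terms to n + 1 terms, which is why the method is attractive in practice.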
Ratio Divergence Learning Using Target Energy in Restricted Boltzmann Machines: Beyond Kullback-Leibler Divergence Learning
arXiv - STAT - Statistics Theory Pub Date : 2024-09-12 DOI: arxiv-2409.07679
Yuichi Ishida, Yuma Ichikawa, Aki Dote, Toshiyuki Miyazawa, Koji Hukushima
Abstract: We propose ratio divergence (RD) learning for discrete energy-based models, a method that utilizes both training data and a tractable target energy function. We apply RD learning to restricted Boltzmann machines (RBMs), which are a minimal model that satisfies the universal approximation theorem for discrete distributions. RD learning combines the strength of both forward and reverse Kullback-Leibler divergence (KLD) learning, effectively addressing the "notorious" issues of underfitting with the forward KLD and mode-collapse with the reverse KLD. Since the summation of forward and reverse KLD seems to be sufficient to combine the strength of both approaches, we include this learning method as a direct baseline in numerical experiments to evaluate its effectiveness. Numerical experiments demonstrate that RD learning significantly outperforms other learning methods in terms of energy function fitting, mode-covering, and learning stability across various discrete energy-based models. Moreover, the performance gaps between RD learning and the other learning methods become more pronounced as the dimensions of target models increase.
Citations: 0