Annals of Statistics: Latest Articles

SPECTRAL METHOD AND REGULARIZED MLE ARE BOTH OPTIMAL FOR TOP-K RANKING.
IF 4.5 | Zone 1 | Mathematics
Annals of Statistics Pub Date : 2019-01-01 Epub Date: 2019-05-21 DOI: 10.1214/18-AOS1745
Yuxin Chen, Jianqing Fan, Cong Ma, Kaizheng Wang
Abstract: This paper is concerned with the problem of top-K ranking from pairwise comparisons. Given a collection of n items and a few pairwise comparisons across them, one wishes to identify the set of K items that receive the highest ranks. To tackle this problem, we adopt the logistic parametric model - the Bradley-Terry-Luce model, where each item is assigned a latent preference score, and where the outcome of each pairwise comparison depends solely on the relative scores of the two items involved. Recent works have made significant progress towards characterizing the performance (e.g. the mean square error for estimating the scores) of several classical methods, including the spectral method and the maximum likelihood estimator (MLE). However, where they stand regarding top-K ranking remains unsettled. We demonstrate that under a natural random sampling model, the spectral method alone, or the regularized MLE alone, is minimax optimal in terms of the sample complexity - the number of paired comparisons needed to ensure exact top-K identification, for the fixed dynamic range regime. This is accomplished via optimal control of the entrywise error of the score estimates. We complement our theoretical studies by numerical experiments, confirming that both methods yield low entrywise errors for estimating the underlying scores. Our theory is established via a novel leave-one-out trick, which proves effective for analyzing both iterative and non-iterative procedures. Along the way, we derive an elementary eigenvector perturbation bound for probability transition matrices, which parallels the Davis-Kahan $\sin\Theta$ theorem for symmetric matrices. This also allows us to close the gap between the $\ell_2$ error upper bound for the spectral method and the minimax lower limit.
Annals of Statistics, Volume 47, Issue 4, Pages 2204-2235. Journal Article.
Citations: 102
HYPOTHESIS TESTING ON LINEAR STRUCTURES OF HIGH DIMENSIONAL COVARIANCE MATRIX.
IF 4.5 | Zone 1 | Mathematics
Annals of Statistics Pub Date : 2019-01-01 Epub Date: 2019-10-31 DOI: 10.1214/18-AOS1779
Shurong Zheng, Zhao Chen, Hengjian Cui, Runze Li
Abstract: This paper is concerned with test of significance on high dimensional covariance structures, and aims to develop a unified framework for testing commonly-used linear covariance structures. We first construct a consistent estimator for parameters involved in the linear covariance structure, and then develop two tests for the linear covariance structures based on entropy loss and quadratic loss used for covariance matrix estimation. To study the asymptotic properties of the proposed tests, we study related high dimensional random matrix theory, and establish several highly useful asymptotic results. With the aid of these asymptotic results, we derive the limiting distributions of these two tests under the null and alternative hypotheses. We further show that the quadratic loss based test is asymptotically unbiased. We conduct Monte Carlo simulation study to examine the finite sample performance of the two tests. Our simulation results show that the limiting null distributions approximate their null distributions quite well, and the corresponding asymptotic critical values keep Type I error rate very well. Our numerical comparison implies that the proposed tests outperform existing ones in terms of controlling Type I error rate and power. Our simulation indicates that the test based on quadratic loss seems to have better power than the test based on entropy loss.
Annals of Statistics, Volume 47, Issue 6, Pages 3300-3334. Journal Article.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6910252/pdf/nihms-1022732.pdf
Citations: 18
UNIFORMLY VALID POST-REGULARIZATION CONFIDENCE REGIONS FOR MANY FUNCTIONAL PARAMETERS IN Z-ESTIMATION FRAMEWORK.
IF 4.5 | Zone 1 | Mathematics
Annals of Statistics Pub Date : 2018-12-01 Epub Date: 2018-09-11 DOI: 10.1214/17-AOS1671
Alexandre Belloni, Victor Chernozhukov, Denis Chetverikov, Ying Wei
Abstract: In this paper, we develop procedures to construct simultaneous confidence bands for $\tilde{p}$ potentially infinite-dimensional parameters after model selection for general moment condition models where $\tilde{p}$ is potentially much larger than the sample size of available data, n. This allows us to cover settings with functional response data where each of the $\tilde{p}$ parameters is a function. The procedure is based on the construction of score functions that satisfy Neyman orthogonality condition approximately. The proposed simultaneous confidence bands rely on uniform central limit theorems for high-dimensional vectors (and not on Donsker arguments as we allow for $\tilde{p} \gg n$). To construct the bands, we employ a multiplier bootstrap procedure which is computationally efficient as it only involves resampling the estimated score functions (and does not require resolving the high-dimensional optimization problems). We formally apply the general theory to inference on regression coefficient process in the distribution regression model with a logistic link, where two implementations are analyzed in detail. Simulations and an application to real data are provided to help illustrate the applicability of the results.
Annals of Statistics, Volume 46, Issue 6B, Pages 3643-3675. Journal Article.
Citations: 71
ASSESSING ROBUSTNESS OF CLASSIFICATION USING ANGULAR BREAKDOWN POINT.
IF 4.5 | Zone 1 | Mathematics
Annals of Statistics Pub Date : 2018-12-01 Epub Date: 2018-09-11 DOI: 10.1214/17-AOS1661
Junlong Zhao, Guan Yu, Yufeng Liu
Abstract: Robustness is a desirable property for many statistical techniques. As an important measure of robustness, breakdown point has been widely used for regression problems and many other settings. Despite the existing development, we observe that the standard breakdown point criterion is not directly applicable for many classification problems. In this paper, we propose a new breakdown point criterion, namely angular breakdown point, to better quantify the robustness of different classification methods. Using this new breakdown point criterion, we study the robustness of binary large margin classification techniques, although the idea is applicable to general classification methods. Both bounded and unbounded loss functions with linear and kernel learning are considered. These studies provide useful insights on the robustness of different classification methods. Numerical results further confirm our theoretical findings.
Annals of Statistics, Volume 46, Issue 6B, Pages 3362-3389. Journal Article.
Citations: 7
A NEW PERSPECTIVE ON ROBUST M-ESTIMATION: FINITE SAMPLE THEORY AND APPLICATIONS TO DEPENDENCE-ADJUSTED MULTIPLE TESTING.
IF 4.5 | Zone 1 | Mathematics
Annals of Statistics Pub Date : 2018-10-01 Epub Date: 2018-08-17 DOI: 10.1214/17-AOS1606
Wen-Xin Zhou, Koushiki Bose, Jianqing Fan, Han Liu
Abstract: Heavy-tailed errors impair the accuracy of the least squares estimate, which can be spoiled by a single grossly outlying observation. As argued in the seminal work of Peter Huber in 1973 [Ann. Statist. 1 (1973) 799-821], robust alternatives to the method of least squares are sorely needed. To achieve robustness against heavy-tailed sampling distributions, we revisit the Huber estimator from a new perspective by letting the tuning parameter involved diverge with the sample size. In this paper, we develop nonasymptotic concentration results for such an adaptive Huber estimator, namely, the Huber estimator with the tuning parameter adapted to sample size, dimension, and the variance of the noise. Specifically, we obtain a sub-Gaussian-type deviation inequality and a nonasymptotic Bahadur representation when noise variables only have finite second moments. The nonasymptotic results further yield two conventional normal approximation results that are of independent interest, the Berry-Esseen inequality and Cramér-type moderate deviation. As an important application to large-scale simultaneous inference, we apply these robust normal approximation results to analyze a dependence-adjusted multiple testing procedure for moderately heavy-tailed data. It is shown that the robust dependence-adjusted procedure asymptotically controls the overall false discovery proportion at the nominal level under mild moment conditions. Thorough numerical results on both simulated and real datasets are also provided to back up our theory.
Annals of Statistics, Volume 46, Issue 5, Pages 1904-1931. Journal Article.
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6133288/pdf/nihms926033.pdf
Citations: 0
ANALYSIS OF "LEARN-AS-YOU-GO" (LAGO) STUDIES.
IF 4.5 | Zone 1 | Mathematics
Annals of Statistics Pub Date : 2018-08-20 DOI: 10.1214/20-AOS1978
D. Nevo, J. Lok, D. Spiegelman
Abstract: In Learn-As-you-GO (LAGO) adaptive studies, the intervention is a complex multicomponent package, and is adapted in stages during the study based on past outcome data. This design formalizes standard practice in public health intervention studies. An effective intervention package is sought, while minimizing intervention package cost. In LAGO study data, the interventions in later stages depend upon the outcomes in the previous stages, violating standard statistical theory. We develop an estimator for the intervention effects, and prove consistency and asymptotic normality using a novel coupling argument, ensuring the validity of the test for the hypothesis of no overall intervention effect. We develop a confidence set for the optimal intervention package and confidence bands for the success probabilities under alternative package compositions. We illustrate our methods in the BetterBirth Study, which aimed to improve maternal and neonatal outcomes among 157,689 births in Uttar Pradesh, India through a multicomponent intervention package.
Annals of Statistics, Volume 49, Issue 2, Pages 793-819. Journal Article.
Citations: 4
LARGE COVARIANCE ESTIMATION THROUGH ELLIPTICAL FACTOR MODELS.
IF 4.5 | Zone 1 | Mathematics
Annals of Statistics Pub Date : 2018-08-01 Epub Date: 2018-06-27 DOI: 10.1214/17-AOS1588
Jianqing Fan, Han Liu, Weichen Wang
Abstract: We propose a general Principal Orthogonal complEment Thresholding (POET) framework for large-scale covariance matrix estimation based on the approximate factor model. A set of high level sufficient conditions for the procedure to achieve optimal rates of convergence under different matrix norms is established to better understand how POET works. Such a framework allows us to recover existing results for sub-Gaussian data in a more transparent way that only depends on the concentration properties of the sample covariance matrix. As a new theoretical contribution, for the first time, such a framework allows us to exploit conditional sparsity covariance structure for the heavy-tailed data. In particular, for the elliptical distribution, we propose a robust estimator based on the marginal and spatial Kendall's tau to satisfy these conditions. In addition, we study conditional graphical model under the same framework. The technical tools developed in this paper are of general interest to high dimensional principal component analysis. Thorough numerical results are also provided to back up the developed theory.
Annals of Statistics, Volume 46, Issue 4, Pages 1383-1414. Journal Article.
Citations: 84
Consistency and convergence rate of phylogenetic inference via regularization.
IF 4.5 | Zone 1 | Mathematics
Annals of Statistics Pub Date : 2018-08-01 Epub Date: 2018-06-27 DOI: 10.1214/17-AOS1592
Vu Dinh, Lam Si Tung Ho, Marc A Suchard, Frederick A Matsen
Abstract: It is common in phylogenetics to have some, perhaps partial, information about the overall evolutionary tree of a group of organisms and wish to find an evolutionary tree of a specific gene for those organisms. There may not be enough information in the gene sequences alone to accurately reconstruct the correct "gene tree." Although the gene tree may deviate from the "species tree" due to a variety of genetic processes, in the absence of evidence to the contrary it is parsimonious to assume that they agree. A common statistical approach in these situations is to develop a likelihood penalty to incorporate such additional information. Recent studies using simulation and empirical data suggest that a likelihood penalty quantifying concordance with a species tree can significantly improve the accuracy of gene tree reconstruction compared to using sequence data alone. However, the consistency of such an approach has not yet been established, nor have convergence rates been bounded. Because phylogenetics is a non-standard inference problem, the standard theory does not apply. In this paper, we propose a penalized maximum likelihood estimator for gene tree reconstruction, where the penalty is the square of the Billera-Holmes-Vogtmann geodesic distance from the gene tree to the species tree. We prove that this method is consistent, and derive its convergence rate for estimating the discrete gene tree structure and continuous edge lengths (representing the amount of evolution that has occurred on that branch) simultaneously. We find that the regularized estimator is "adaptive fast converging," meaning that it can reconstruct all edges of length greater than any given threshold from gene sequences of polynomial length. Our method does not require the species tree to be known exactly; in fact, our asymptotic theory holds for any such guide tree.
Annals of Statistics, Volume 46, Issue 4, Pages 1481-1512. Journal Article.
Citations: 8
Optimal Shrinkage of Eigenvalues in the Spiked Covariance Model.
IF 4.5 | Zone 1 | Mathematics
Annals of Statistics Pub Date : 2018-08-01 Epub Date: 2018-06-27 DOI: 10.1214/17-AOS1601
David L Donoho, Matan Gavish, Iain M Johnstone
Abstract: We show that in a common high-dimensional covariance model, the choice of loss function has a profound effect on optimal estimation. In an asymptotic framework based on the Spiked Covariance model and use of orthogonally invariant estimators, we show that optimal estimation of the population covariance matrix boils down to design of an optimal shrinker η that acts elementwise on the sample eigenvalues. Indeed, to each loss function there corresponds a unique admissible eigenvalue shrinker η* dominating all other shrinkers. The shape of the optimal shrinker is determined by the choice of loss function and, crucially, by inconsistency of both eigenvalues and eigenvectors of the sample covariance matrix. Details of these phenomena and closed form formulas for the optimal eigenvalue shrinkers are worked out for a menagerie of 26 loss functions for covariance estimation found in the literature, including the Stein, Entropy, Divergence, Fréchet, Bhattacharya/Matusita, Frobenius Norm, Operator Norm, Nuclear Norm and Condition Number losses.
Annals of Statistics, Volume 46, Issue 4, Pages 1742-1778. Journal Article.
Citations: 181
BALL DIVERGENCE: NONPARAMETRIC TWO SAMPLE TEST.
IF 4.5 | Zone 1 | Mathematics
Annals of Statistics Pub Date : 2018-06-01 DOI: 10.1214/17-AOS1579
Wenliang Pan, Yuan Tian, Xueqin Wang, Heping Zhang
Abstract: In this paper, we first introduce Ball Divergence, a novel measure of the difference between two probability measures in separable Banach spaces, and show that the Ball Divergence of two probability measures is zero if and only if these two probability measures are identical without any moment assumption. Using Ball Divergence, we present a metric rank test procedure to detect the equality of distribution measures underlying independent samples. It is therefore robust to outliers or heavy-tail data. We show that this multivariate two sample test statistic is consistent with the Ball Divergence, and it converges to a mixture of $\chi^2$ distributions under the null hypothesis and a normal distribution under the alternative hypothesis. Importantly, we prove its consistency against a general alternative hypothesis. Moreover, this result does not depend on the ratio of the two imbalanced sample sizes, ensuring that it can be applied to imbalanced data. Numerical studies confirm that our test is superior to several existing tests in terms of Type I error and power. We conclude our paper with two applications of our method: one is for virtual screening in drug development process and the other is for genome wide expression analysis in hormone replacement therapy.
Annals of Statistics, Volume 46, Issue 3, Pages 1109-1137. Journal Article.
Citations: 41