Authors: B. Bhattacharya
Journal: Annals of Statistics, volume 48, pages 2879-2903 (published 2020-10-01)
DOI: 10.1214/19-AOS1913 (https://doi.org/10.1214/19-AOS1913)
Asymptotic distribution and detection thresholds for two-sample tests based on geometric graphs
In this paper we consider the problem of testing the equality of two multivariate distributions based on geometric graphs, constructed using the inter-point distances between the observations. These include the test based on the minimum spanning tree and the K-nearest neighbor (NN) graphs, among others. These tests are asymptotically distribution-free, universally consistent, and computationally efficient, making them particularly useful in modern applications. However, very little is known about the power properties of these tests. In this paper, using theory of stabilizing geometric graphs, we derive the asymptotic distribution of these tests under general alternatives, in the Poissonized setting. Using this, the detection threshold and the limiting local power of the test based on the K-NN graph are obtained, where interesting exponents depending on dimension emerge. This provides a way to compare and justify the performance of these tests in different examples.
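To make the K-NN graph test mentioned in the abstract concrete, here is a minimal sketch of one common form of the statistic (the fraction of nearest-neighbor pairs that fall in the same sample, in the style of Schilling and Henze). This is an illustration under stated assumptions, not the paper's own construction: the function names, the brute-force distance computation, and the permutation calibration are all choices made for this example. In particular, the permutation p-value is swapped in for the asymptotic null distribution that the paper analyzes.

```python
import numpy as np


def knn_indices(z, k):
    """Return, for each row of z, the indices of its k nearest other rows."""
    # Brute-force squared Euclidean distances; fine for small samples.
    d2 = ((z[:, None, :] - z[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbor
    return np.argsort(d2, axis=1)[:, :k]


def same_sample_fraction(nn, labels):
    """Fraction of K-NN edges whose two endpoints carry the same label."""
    return float(np.mean(labels[nn] == labels[:, None]))


def knn_two_sample_test(x, y, k=3, n_perm=500, seed=0):
    """Hypothetical helper: K-NN statistic with a permutation p-value.

    Large values of the statistic (neighbors mostly from the same sample)
    are taken as evidence against equality of the two distributions.
    """
    rng = np.random.default_rng(seed)
    z = np.vstack([x, y])
    labels = np.concatenate([np.zeros(len(x), dtype=int),
                             np.ones(len(y), dtype=int)])
    # The graph depends only on the pooled points, so it is built once
    # and only the labels are shuffled under the null.
    nn = knn_indices(z, k)
    observed = same_sample_fraction(nn, labels)
    perm = np.array([same_sample_fraction(nn, rng.permutation(labels))
                     for _ in range(n_perm)])
    p_value = (1 + np.sum(perm >= observed)) / (n_perm + 1)
    return observed, float(p_value)
```

Calling `knn_two_sample_test(x, y)` on two arrays of shape `(n, d)` and `(m, d)` returns the statistic and a p-value. Building the graph once and permuting only the labels keeps the cost of calibration low, which reflects the computational efficiency the abstract attributes to these tests.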
About the journal:
The Annals of Statistics aims to publish research papers of the highest quality reflecting the many facets of contemporary statistics. Primary emphasis is placed on importance and originality, not on formalism. The journal aims to cover all areas of statistics, especially mathematical statistics and applied and interdisciplinary statistics. Of course, many of the best papers will touch on more than one of these general areas, because the discipline of statistics has deep roots in mathematics and in substantive scientific fields.