Xiuyuan Cheng, Alexander Cloninger, Ronald R Coifman
Two-sample statistics based on anisotropic kernels
Information and Inference: A Journal of the IMA, 9(3), 677-719. Published 2020-09-01 (Epub 2019-12-10).
DOI: 10.1093/imaiai/iaz018 (https://doi.org/10.1093/imaiai/iaz018)
Citations: 16
Abstract
The paper introduces a new kernel-based Maximum Mean Discrepancy (MMD) statistic for measuring the distance between two distributions given finitely many multivariate samples. When the distributions are locally low-dimensional, the proposed test can be made more powerful at distinguishing certain alternatives by incorporating local covariance matrices and constructing an anisotropic kernel. The kernel matrix is asymmetric; it computes the affinity between [Formula: see text] data points and a set of [Formula: see text] reference points, where [Formula: see text] can be drastically smaller than [Formula: see text]. While the proposed statistic can be viewed as a special class of Reproducing Kernel Hilbert Space MMD, the consistency of the test is proved, under mild assumptions on the kernel, as long as [Formula: see text], and a finite-sample lower bound on the testing power is obtained. Applications to flow cytometry and diffusion MRI datasets, which motivate the proposed approach to comparing distributions, are demonstrated.
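The construction described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's definition: the Gaussian form of the anisotropic kernel, the k-nearest-neighbor covariance estimate, the regularization `reg`, the averaging of squared differences over reference points, and the permutation calibration are all assumptions made for illustration, and the function names `local_covariances`, `anisotropic_kernel`, `reference_mmd`, and `permutation_pvalue` are hypothetical.

```python
import numpy as np


def local_covariances(refs, pooled, k=20, reg=1e-3):
    """One covariance matrix per reference point, from its k nearest pooled samples (assumed estimator)."""
    d = pooled.shape[1]
    covs = np.empty((refs.shape[0], d, d))
    for i, r in enumerate(refs):
        dist2 = np.sum((pooled - r) ** 2, axis=1)
        nbrs = pooled[np.argsort(dist2)[:k]]
        # Regularize so the matrix stays invertible when the data are locally low-dimensional.
        covs[i] = np.cov(nbrs, rowvar=False) + reg * np.eye(d)
    return covs


def anisotropic_kernel(refs, covs, pts):
    """Asymmetric (n_refs x n_pts) affinity: exp(-0.5 * squared Mahalanobis distance to each reference)."""
    diff = pts[None, :, :] - refs[:, None, :]    # shape (n_refs, n_pts, d)
    prec = np.linalg.inv(covs)                   # shape (n_refs, d, d)
    maha2 = np.einsum("rnd,rde,rne->rn", diff, prec, diff)
    return np.exp(-0.5 * maha2)


def reference_mmd(x, y, refs, covs):
    """Mean over reference points of the squared difference between the two samples' average affinities."""
    hx = anisotropic_kernel(refs, covs, x).mean(axis=1)
    hy = anisotropic_kernel(refs, covs, y).mean(axis=1)
    return float(np.mean((hx - hy) ** 2))


def permutation_pvalue(x, y, refs, covs, n_perm=200, seed=0):
    """Permutation p-value with the reference points and covariances held fixed (assumed calibration)."""
    rng = np.random.default_rng(seed)
    K = anisotropic_kernel(refs, covs, np.vstack([x, y]))  # computed once, reused for every permutation
    n = len(x)

    def stat(idx_x, idx_y):
        return np.mean((K[:, idx_x].mean(axis=1) - K[:, idx_y].mean(axis=1)) ** 2)

    idx = np.arange(K.shape[1])
    observed = stat(idx[:n], idx[n:])
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(idx)
        count += int(stat(perm[:n], perm[n:]) >= observed)
    return (count + 1) / (n_perm + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(300, 2))
    y = rng.normal(size=(300, 2)) + np.array([3.0, 0.0])
    pooled = np.vstack([x, y])
    # Far fewer reference points than data points.
    refs = pooled[rng.choice(len(pooled), size=30, replace=False)]
    covs = local_covariances(refs, pooled)
    print("statistic:", reference_mmd(x, y, refs, covs))
    print("p-value:", permutation_pvalue(x, y, refs, covs))
```

Because the kernel matrix has one row per reference point and one column per data point, its size grows with the product of the two counts rather than with the square of the sample size, which is where a small reference set saves work. In this sketch the kernel matrix is computed once on the pooled sample and only column subsets are re-averaged in each permutation.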