球散度:非参数双样本检验。

IF 3.2 1区 数学 Q1 STATISTICS & PROBABILITY
Wenliang Pan, Yuan Tian, Xueqin Wang, Heping Zhang
{"title":"球散度:非参数双样本检验。","authors":"Wenliang Pan,&nbsp;Yuan Tian,&nbsp;Xueqin Wang,&nbsp;Heping Zhang","doi":"10.1214/17-AOS1579","DOIUrl":null,"url":null,"abstract":"<p><p>In this paper, we first introduce Ball Divergence, a novel measure of the difference between two probability measures in separable Banach spaces, and show that the Ball Divergence of two probability measures is zero if and only if these two probability measures are identical without any moment assumption. Using Ball Divergence, we present a metric rank test procedure to detect the equality of distribution measures underlying independent samples. It is therefore robust to outliers or heavy-tail data. We show that this multivariate two sample test statistic is consistent with the Ball Divergence, and it converges to a mixture of χ<sup>2</sup> distributions under the null hypothesis and a normal distribution under the alternative hypothesis. Importantly, we prove its consistency against a general alternative hypothesis. Moreover, this result does not depend on the ratio of the two imbalanced sample sizes, ensuring that can be applied to imbalanced data. Numerical studies confirm that our test is superior to several existing tests in terms of Type I error and power. We conclude our paper with two applications of our method: one is for virtual screening in drug development process and the other is for genome wide expression analysis in hormone replacement therapy.</p>","PeriodicalId":8032,"journal":{"name":"Annals of Statistics","volume":"46 3","pages":"1109-1137"},"PeriodicalIF":3.2000,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1214/17-AOS1579","citationCount":"41","resultStr":"{\"title\":\"BALL DIVERGENCE: NONPARAMETRIC TWO SAMPLE TEST.\",\"authors\":\"Wenliang Pan,&nbsp;Yuan Tian,&nbsp;Xueqin Wang,&nbsp;Heping Zhang\",\"doi\":\"10.1214/17-AOS1579\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>In this paper, we first introduce Ball Divergence, a novel measure of the difference between two probability measures in separable Banach spaces, and show that the Ball Divergence of two probability measures is zero if and only if these two probability measures are identical without any moment assumption. Using Ball Divergence, we present a metric rank test procedure to detect the equality of distribution measures underlying independent samples. It is therefore robust to outliers or heavy-tail data. We show that this multivariate two sample test statistic is consistent with the Ball Divergence, and it converges to a mixture of χ<sup>2</sup> distributions under the null hypothesis and a normal distribution under the alternative hypothesis. Importantly, we prove its consistency against a general alternative hypothesis. Moreover, this result does not depend on the ratio of the two imbalanced sample sizes, ensuring that can be applied to imbalanced data. Numerical studies confirm that our test is superior to several existing tests in terms of Type I error and power. We conclude our paper with two applications of our method: one is for virtual screening in drug development process and the other is for genome wide expression analysis in hormone replacement therapy.</p>\",\"PeriodicalId\":8032,\"journal\":{\"name\":\"Annals of Statistics\",\"volume\":\"46 3\",\"pages\":\"1109-1137\"},\"PeriodicalIF\":3.2000,\"publicationDate\":\"2018-06-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://sci-hub-pdf.com/10.1214/17-AOS1579\",\"citationCount\":\"41\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Annals of Statistics\",\"FirstCategoryId\":\"100\",\"ListUrlMain\":\"https://doi.org/10.1214/17-AOS1579\",\"RegionNum\":1,\"RegionCategory\":\"数学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"STATISTICS & PROBABILITY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Annals of Statistics","FirstCategoryId":"100","ListUrlMain":"https://doi.org/10.1214/17-AOS1579","RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"STATISTICS & PROBABILITY","Score":null,"Total":0}
引用次数: 41

摘要

在本文中,我们首先引入了可分离Banach空间中两个概率测度之差的一个新测度Ball散度,并证明了两个概率度量的Ball散度为零,当且仅当这两个概率量度在没有任何矩假设的情况下是相同的。使用Ball Divergence,我们提出了一个度量秩检验程序来检测独立样本下分布测度的相等性。因此,它对异常值或重尾数据是稳健的。我们证明了这个多变量两样本检验统计量与Ball散度一致,并且它在零假设下收敛于χ2分布的混合,在替代假设下收敛为正态分布。重要的是,我们证明了它与一般替代假设的一致性。此外,这一结果不取决于两个不平衡样本大小的比率,确保了这一点可以应用于不平衡数据。数值研究证实,我们的测试在I型误差和功率方面优于现有的几种测试。最后,我们将我们的方法应用于两个方面:一是用于药物开发过程中的虚拟筛选,二是用于激素替代治疗中的全基因组表达分析。
本文章由计算机程序翻译,如有差异,请以英文原文为准。

BALL DIVERGENCE: NONPARAMETRIC TWO SAMPLE TEST.

BALL DIVERGENCE: NONPARAMETRIC TWO SAMPLE TEST.

In this paper, we first introduce Ball Divergence, a novel measure of the difference between two probability measures in separable Banach spaces, and show that the Ball Divergence of two probability measures is zero if and only if these two probability measures are identical without any moment assumption. Using Ball Divergence, we present a metric rank test procedure to detect the equality of distribution measures underlying independent samples. It is therefore robust to outliers or heavy-tail data. We show that this multivariate two sample test statistic is consistent with the Ball Divergence, and it converges to a mixture of χ2 distributions under the null hypothesis and a normal distribution under the alternative hypothesis. Importantly, we prove its consistency against a general alternative hypothesis. Moreover, this result does not depend on the ratio of the two imbalanced sample sizes, ensuring that can be applied to imbalanced data. Numerical studies confirm that our test is superior to several existing tests in terms of Type I error and power. We conclude our paper with two applications of our method: one is for virtual screening in drug development process and the other is for genome wide expression analysis in hormone replacement therapy.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Annals of Statistics
Annals of Statistics 数学-统计学与概率论
CiteScore
9.30
自引率
8.90%
发文量
119
审稿时长
6-12 weeks
期刊介绍: The Annals of Statistics aim to publish research papers of highest quality reflecting the many facets of contemporary statistics. Primary emphasis is placed on importance and originality, not on formalism. The journal aims to cover all areas of statistics, especially mathematical statistics and applied & interdisciplinary statistics. Of course many of the best papers will touch on more than one of these general areas, because the discipline of statistics has deep roots in mathematics, and in substantive scientific fields.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信