An approximation method of extremely low p-values using permutation test
Sangseob Leem, T. Park
2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), 2018-12-01
DOI: https://doi.org/10.1109/BIBM.2018.8621082
The permutation test is a non-parametric method for assessing statistical significance and is widely used in many disciplines, including bioinformatics. It is especially useful when the null distribution of a test statistic is unknown or hard to determine. In a permutation test, the p-value is calculated as the proportion of permutations in which the statistic computed on randomly shuffled data is at least as extreme as the statistic computed on the observed data. The precision of the resulting significance depends on the number of permutations, so computation time precludes estimating extremely low p-values directly. In this paper, we propose a novel strategy for approximating extremely low p-values. If two data sets of different sizes show similar patterns, the smaller data set yields a higher p-value than the larger one. Dividing the data therefore makes the significance of each subset easier to assess by a permutation test, because the subset p-values are relatively large. The p-values of the subsets are then integrated into a final p-value, as in a meta-analysis. Our proposed method consists of two steps: (1) divide the data into subsets and perform a permutation test on each subset; and (2) integrate the resulting p-values by Stouffer’s z-score method. We demonstrate and validate our method using simulation studies. These assessments show that p-values of about 1.0e-20 could be well estimated by the proposed method in a single day for samples larger than 5,000.
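
The two steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the test statistic (a one-sided difference in group means), the add-one correction that keeps each subset p-value above zero, the equal-weight form of Stouffer's method, and the hypothetical function names `permutation_pvalue`, `stouffer_combine`, and `split_and_combine`.

```python
import math
import random
from statistics import NormalDist


def permutation_pvalue(x, y, n_perm, rng):
    """One-sided permutation p-value for mean(x) - mean(y) (assumed statistic)."""
    observed = sum(x) / len(x) - sum(y) / len(y)
    pooled = list(x) + list(y)
    n_x = len(x)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # randomly reassign group labels
        stat = sum(pooled[:n_x]) / n_x - sum(pooled[n_x:]) / (len(pooled) - n_x)
        if stat >= observed:  # "more extreme than, or equal to" the observed value
            hits += 1
    # Assumption: add-one correction, so the estimate is never exactly zero.
    return (hits + 1) / (n_perm + 1)


def stouffer_combine(pvalues):
    """Combine one-sided p-values with the unweighted Stouffer z-score method."""
    std_normal = NormalDist()
    # Clamp below 1 because inv_cdf is only defined on the open interval (0, 1).
    z_scores = [-std_normal.inv_cdf(min(p, 1.0 - 1e-12)) for p in pvalues]
    z_combined = sum(z_scores) / math.sqrt(len(z_scores))
    # Upper-tail probability via erfc, which stays accurate for very large z.
    return 0.5 * math.erfc(z_combined / math.sqrt(2.0))


def split_and_combine(x, y, n_subsets, n_perm, seed=0):
    """Step 1: test each disjoint subset. Step 2: integrate the subset p-values."""
    rng = random.Random(seed)
    x, y = list(x), list(y)
    rng.shuffle(x)
    rng.shuffle(y)
    pvalues = []
    for k in range(n_subsets):
        # Every n_subsets-th sample goes to subset k, giving disjoint subsets.
        pvalues.append(permutation_pvalue(x[k::n_subsets], y[k::n_subsets], n_perm, rng))
    return stouffer_combine(pvalues), pvalues


if __name__ == "__main__":
    data_rng = random.Random(1)
    group_a = [data_rng.gauss(1.0, 1.0) for _ in range(200)]  # simulated data
    group_b = [data_rng.gauss(0.0, 1.0) for _ in range(200)]
    combined, subset_pvalues = split_and_combine(group_a, group_b, n_subsets=4, n_perm=200)
    print("subset p-values:", subset_pvalues)
    print("combined p-value:", combined)
```

In this sketch each subset p-value cannot fall below 1/(n_perm + 1), yet the combined value can be far smaller because the subset z-scores add up before being converted back to a p-value. That is the effect the abstract relies on when it replaces one very expensive test with several cheaper ones.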