Evaluating the median p-value method for assessing the statistical significance of tests when using multiple imputation.

IF 1.1 | CAS Zone 4, Mathematics | JCR Q2, Statistics & Probability

Journal of Applied Statistics | Pub Date: 2024-10-25 | eCollection Date: 2025-01-01 | DOI: 10.1080/02664763.2024.2418473

Peter C Austin, Iris Eekhout, Stef van Buuren

Volume 52(6), pages 1161-1176. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12035927/pdf/

Citations: 0

Abstract

Rubin's Rules are commonly used to pool the results of statistical analyses across imputed samples when using multiple imputation. Rubin's Rules cannot be used when the result of an analysis in an imputed dataset is not a statistic and its associated standard error, but a test statistic (e.g. Student's t-test). While complex methods have been proposed for pooling test statistics across imputed samples, these methods have not been implemented in many popular statistical software packages. The median p-value method has been proposed for pooling test statistics. The statistical significance level of the pooled test statistic is the median of the associated p-values across the imputed samples. We evaluated the performance of this method with nine statistical tests: Student's t-test, Wilcoxon Rank Sum test, Analysis of Variance, Kruskal-Wallis test, the test of significance for Pearson's and Spearman's correlation coefficient, the Chi-squared test, the test of significance for a regression coefficient from a linear regression and from a logistic regression. For each test, the empirical type I error rate was higher than the advertised rate. The magnitude of inflation increased as the prevalence of missing data increased. The median p-value method should not be used to assess statistical significance across imputed datasets.
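To make the contrast in the abstract concrete, here is a minimal Python sketch of the two pooling approaches it mentions. It is an illustration under stated assumptions, not the authors' code: the function names `pool_rubin` and `pool_median_p`, the default `alpha=0.05`, and all input numbers are hypothetical. The first function applies the standard form of Rubin's Rules to per-imputation estimates and standard errors; the second takes the median of the per-imputation p-values, which is the method the paper evaluates and advises against.

```python
from statistics import mean, median, variance


def pool_rubin(estimates, std_errors):
    """Pool per-imputation estimates and standard errors with Rubin's Rules."""
    m = len(estimates)
    pooled_estimate = mean(estimates)
    # Average of the squared standard errors (within-imputation variance).
    within = mean([se ** 2 for se in std_errors])
    # Sample variance of the estimates (between-imputation variance).
    between = variance(estimates)
    total = within + (1 + 1 / m) * between
    return pooled_estimate, total ** 0.5


def pool_median_p(p_values, alpha=0.05):
    """Median p-value method: the pooled p-value is the median across imputations."""
    pooled_p = median(p_values)
    return pooled_p, pooled_p < alpha


# Made-up p-values from the same test run on five imputed datasets.
pooled_p, significant = pool_median_p([0.03, 0.08, 0.04, 0.06, 0.02])
print(pooled_p, significant)

# Made-up estimates and standard errors from three imputed datasets.
print(pool_rubin([1.0, 2.0, 3.0], [1.0, 1.0, 1.0]))
```

Note that `pool_median_p` discards the between-imputation variability that `pool_rubin` adds to the total variance. The sketch only shows the mechanics; according to the abstract, the median p-value method produced empirical type I error rates above the advertised rate for all nine tests studied.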

Source journal

Journal of Applied Statistics (Mathematics: Statistics & Probability)

CiteScore: 3.40

Self-citation rate: 0.00%

Articles published: 126

Review time: 6 months

About the journal: Journal of Applied Statistics provides a forum for communication between applied statisticians and users of applied statistical techniques across a wide range of disciplines. These areas include business, computing, economics, ecology, education, management, medicine, operational research and sociology, but papers from other areas are also considered. The editorial policy is to publish rigorous but clear and accessible papers on applied techniques. Purely theoretical papers are avoided but those on theoretical developments which clearly demonstrate significant applied potential are welcomed. Each paper is submitted to at least two independent referees.