Title: Sensitivity Analysis for Binary Outcome Misclassification in Randomization Tests via Integer Programming
Authors: Siyu Heng, Pamela A. Shaw
Journal: Journal of Computational and Graphical Statistics (Impact Factor 1.8, JCR Q2, Statistics & Probability)
DOI: 10.1080/10618600.2025.2461222
Published: 2025-04-17 (Journal Article)
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12377470/pdf/
Citations: 0
Abstract
Conducting a randomization test is a common method for testing causal null hypotheses in randomized experiments. The popularity of randomization tests is largely due to the fact that their statistical validity depends only on the randomization design; no distributional or modeling assumption on the outcome variable is needed. However, randomization tests may still suffer from other sources of bias, among which outcome misclassification is a significant one. We propose a model-free and finite-population sensitivity analysis approach for binary outcome misclassification in randomization tests. A central quantity in our framework is the "warning accuracy," defined as the threshold such that the randomization test result based on the measured outcomes may differ from that based on the true outcomes if the outcome measurement accuracy does not surpass that threshold. We show how learning the warning accuracy and related concepts can amplify analyses of randomization tests subject to outcome misclassification without adding assumptions. We show that the warning accuracy can be computed efficiently for large data sets by adaptively reformulating a large-scale integer program with respect to the randomization design. We apply the proposed approach to the Prostate Cancer Prevention Trial (PCPT). We also developed an open-source R package implementing our approach.
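To make the setup concrete, the sketch below is a minimal, hypothetical illustration (not the paper's method): a Monte Carlo randomization test for a binary outcome under the sharp null, plus a brute-force check of whether the rejection decision can change when at most k measured outcomes are misclassified (flipped). The paper computes the warning accuracy efficiently via an adaptively reformulated integer program; the exhaustive enumeration here is only feasible for toy data. All function names and the test statistic (difference in outcome proportions) are illustrative assumptions.

```python
import itertools
import numpy as np

def randomization_test(z, y, n_perm=10000, seed=0):
    """Monte Carlo p-value for the sharp null of no treatment effect.

    z : 0/1 treatment indicators; y : 0/1 binary outcomes.
    Test statistic: difference in outcome proportions (two-sided).
    """
    rng = np.random.default_rng(seed)
    z, y = np.asarray(z), np.asarray(y)
    obs = y[z == 1].mean() - y[z == 0].mean()
    count = 0
    for _ in range(n_perm):
        zp = rng.permutation(z)  # re-randomize under complete randomization
        stat = y[zp == 1].mean() - y[zp == 0].mean()
        if abs(stat) >= abs(obs):
            count += 1
    return (count + 1) / (n_perm + 1)  # add-one Monte Carlo p-value

def decision_changes(z, y, k, alpha=0.05, n_perm=2000):
    """Brute-force sensitivity check: over all outcome vectors differing
    from y in at most k entries, does the level-alpha rejection decision
    ever differ from the one based on the measured outcomes?"""
    base = randomization_test(z, y, n_perm) <= alpha
    for flips in itertools.combinations(range(len(y)), k):
        y2 = np.array(y)
        y2[list(flips)] ^= 1  # flip (misclassify) the selected outcomes
        if (randomization_test(z, y2, n_perm) <= alpha) != base:
            return True
    return False
```

In this toy framing, the warning accuracy corresponds to 1 minus the smallest flip fraction k/n for which `decision_changes` returns True; the integer-programming formulation in the paper avoids this exponential enumeration.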
About the journal:
The Journal of Computational and Graphical Statistics (JCGS) presents the latest techniques for improving and extending the use of computational and graphical methods in statistics and data analysis. Established in 1992, the journal contains cutting-edge research, data, surveys, and more on numerical and graphical displays and methods, and perception. Articles are written for readers who have a strong background in statistics but are not necessarily experts in computing. Published in March, June, September, and December.