arXiv - MATH - Statistics Theory: Latest Papers

Precision-based designs for sequential randomized experiments

arXiv - MATH - Statistics Theory Pub Date : 2024-05-06 DOI: arxiv-2405.03487

Mattias Nordin, Mårten Schultzberg

Abstract: In this paper, we consider an experimental setting where units enter the experiment sequentially. Our goal is to form stopping rules which lead to estimators of treatment effects with a given precision. We propose a fixed-width confidence interval design (FWCID) where the experiment terminates once a pre-specified confidence interval width is achieved. We show that under this design, the difference-in-means estimator is a consistent estimator of the average treatment effect and standard confidence intervals have asymptotic guarantees of coverage and efficiency for several versions of the design. In addition, we propose a version of the design that we call fixed power design (FPD) where a given power is asymptotically guaranteed for a given treatment effect, without the need to specify the variances of the outcomes under treatment or control. In addition, this design also gives a consistent difference-in-means estimator with correct coverage of the corresponding standard confidence interval. We complement our theoretical findings with Monte Carlo simulations where we compare our proposed designs with standard designs in the sequential experiments literature, showing that our designs outperform these designs in several important aspects. We believe our results to be relevant for many experimental settings where units enter sequentially, such as in clinical trials, as well as in online A/B tests used by the tech and e-commerce industry.
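The stopping rule described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes units arrive in treated/control pairs, uses a normal-approximation interval for the difference in means, and the names `fixed_width_experiment`, `RunningMoments`, `min_pairs`, and `max_pairs` are hypothetical choices made for illustration only.

```python
import math
import random
from statistics import NormalDist


class RunningMoments:
    """Welford's online algorithm for the mean and unbiased sample variance."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0

    def add(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        return self._m2 / (self.n - 1)


def fixed_width_experiment(draw_treated, draw_control, target_width,
                           alpha=0.05, min_pairs=30, max_pairs=1_000_000):
    """Enroll one treated and one control unit per step (an assumption of this
    sketch) and stop once the normal-approximation CI is narrow enough."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    treated, control = RunningMoments(), RunningMoments()
    width = math.inf
    while treated.n < max_pairs:
        treated.add(draw_treated())
        control.add(draw_control())
        n = treated.n
        if n < min_pairs:
            continue  # burn-in so the variance estimates are not degenerate
        se = math.sqrt(treated.variance / n + control.variance / n)
        width = 2 * z * se
        if width <= target_width:
            break
    # Difference-in-means estimate, achieved CI width, pairs enrolled.
    return treated.mean - control.mean, width, treated.n


rng = random.Random(0)
estimate, width, n_pairs = fixed_width_experiment(
    draw_treated=lambda: rng.gauss(1.0, 1.0),
    draw_control=lambda: rng.gauss(0.0, 1.0),
    target_width=0.5,
)
print(f"pairs={n_pairs}, estimate={estimate:.3f}, CI width={width:.3f}")
```

The sample size here is random: it depends on the estimated variances rather than on variances specified in advance, which is the feature the abstract emphasizes.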

Citations: 0

Strang Splitting for Parametric Inference in Second-order Stochastic Differential Equations

arXiv - MATH - Statistics Theory Pub Date : 2024-05-06 DOI: arxiv-2405.03606

Predrag Pilipovic, Adeline Samson, Susanne Ditlevsen

Abstract: We address parameter estimation in second-order stochastic differential equations (SDEs), prevalent in physics, biology, and ecology. A second-order SDE is converted to a first-order system by introducing an auxiliary velocity variable, raising two main challenges. First, the system is hypoelliptic since the noise affects only the velocity, making the Euler-Maruyama estimator ill-conditioned. To overcome that, we propose an estimator based on the Strang splitting scheme. Second, since the velocity is rarely observed, we adjust the estimator for partial observations. We present four estimators for complete and partial observations, using full likelihood or only velocity marginal likelihood. These estimators are intuitive, easy to implement, and computationally fast, and we prove their consistency and asymptotic normality. Our analysis demonstrates that using full likelihood with complete observations reduces the asymptotic variance of the diffusion estimator. With partial observations, the asymptotic variance increases due to information loss but remains unaffected by the likelihood choice. However, a numerical study on the Kramers oscillator reveals that using marginal likelihood for partial observations yields less biased estimators. We apply our approach to paleoclimate data from the Greenland ice core and fit it to the Kramers oscillator model, capturing transitions between metastable states reflecting observed climatic conditions during glacial eras.

Citations: 0

Stability of a Generalized Debiased Lasso with Applications to Resampling-Based Variable Selection

arXiv - MATH - Statistics Theory Pub Date : 2024-05-05 DOI: arxiv-2405.03063

Jingbo Liu

Abstract: Suppose that we first apply the Lasso to a design matrix, and then update one of its columns. In general, the signs of the Lasso coefficients may change, and there is no closed-form expression for updating the Lasso solution exactly. In this work, we propose an approximate formula for updating a debiased Lasso coefficient. We provide general nonasymptotic error bounds in terms of the norms and correlations of a given design matrix's columns, and then prove asymptotic convergence results for the case of a random design matrix with i.i.d. sub-Gaussian row vectors and i.i.d. Gaussian noise. Notably, the approximate formula is asymptotically correct for most coordinates in the proportional growth regime, under the mild assumption that each row of the design matrix is sub-Gaussian with a covariance matrix having a bounded condition number. Our proof only requires certain concentration and anti-concentration properties to control various error terms and the number of sign changes. In contrast, rigorously establishing distributional limit properties (e.g., Gaussian limits for the debiased Lasso) under similarly general assumptions has been considered an open problem in the universality theory. As applications, we show that the approximate formula allows us to reduce the computational complexity of variable selection algorithms that require solving multiple Lasso problems, such as the conditional randomization test and a variant of the knockoff filter.

Citations: 0

Limiting Behavior of Maxima under Dependence

arXiv - MATH - Statistics Theory Pub Date : 2024-05-05 DOI: arxiv-2405.02833

Klaus Herrmann, Marius Hofert, Johanna G. Neslehova

Citations: 0

Probabilistic cellular automata with local transition matrices: synchronization, ergodicity, and inference

arXiv - MATH - Statistics Theory Pub Date : 2024-05-05 DOI: arxiv-2405.02928

Erhan Bayraktar, Fei Lu, Mauro Maggioni, Ruoyu Wu, Sichen Yang

Citations: 0

Tuning parameter selection in econometrics

arXiv - MATH - Statistics Theory Pub Date : 2024-05-05 DOI: arxiv-2405.03021

Denis Chetverikov

Citations: 0

Negative Probability

arXiv - MATH - Statistics Theory Pub Date : 2024-05-05 DOI: arxiv-2405.03043

Nick Polson, Vadim Sokolov

Citations: 0

Unscented Trajectory Optimization

arXiv - MATH - Statistics Theory Pub Date : 2024-05-04 DOI: arxiv-2405.02753

I. M. Ross, R. J. Proulx, M. Karpenko

Abstract: In a nutshell, unscented trajectory optimization is the generation of optimal trajectories through the use of an unscented transform. Although unscented trajectory optimization was introduced by the authors about a decade ago, it is reintroduced in this paper as a special instantiation of tychastic optimal control theory. Tychastic optimal control theory (from Tyche, the Greek goddess of chance) avoids the use of a Brownian motion and the resulting Itô calculus even though it uses random variables across the entire spectrum of a problem formulation. This approach circumvents the enormous technical and numerical challenges associated with stochastic trajectory optimization. Furthermore, it is shown how a tychastic optimal control problem that involves nonlinear transformations of the expectation operator can be quickly instantiated using an unscented transform. These nonlinear transformations are particularly useful in managing trajectory dispersions, be it associated with path constraints or targeted values of final-time conditions. This paper also presents a systematic and rapid process for formulating and computing the most desirable tychastic trajectory using an unscented transform. Numerical examples are used to illustrate how unscented trajectory optimization may be used for risk reduction and mission recovery caused by uncertainties and failures.
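For readers unfamiliar with the unscented transform that this abstract builds on, the following is a minimal scalar sketch of the standard sigma-point construction. It is not taken from the paper and does not touch the optimal control formulation; the function name `unscented_transform_1d` and the default `kappa=2.0` are assumptions chosen for illustration.

```python
import math


def unscented_transform_1d(mu, sigma, f, kappa=2.0):
    """Approximate the mean and variance of f(X) for scalar X with mean mu and
    standard deviation sigma, using three sigma points (dimension n = 1)."""
    n = 1
    spread = math.sqrt(n + kappa) * sigma
    points = [mu, mu + spread, mu - spread]
    weights = [kappa / (n + kappa), 0.5 / (n + kappa), 0.5 / (n + kappa)]
    # Push each sigma point through the nonlinearity, then take weighted moments.
    values = [f(x) for x in points]
    mean = sum(w * y for w, y in zip(weights, values))
    var = sum(w * (y - mean) ** 2 for w, y in zip(weights, values))
    return mean, var


ut_mean, ut_var = unscented_transform_1d(mu=1.0, sigma=0.5, f=lambda x: x * x)
print(ut_mean, ut_var)
```

For X ~ N(1, 0.5²) and f(x) = x², the exact mean and variance are 1.25 and 1.125. With n + kappa = 3 the sigma points match the Gaussian fourth moment, so the sketch reproduces both values up to floating-point error while evaluating f only three times.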

Citations: 0

Power-Enhanced Two-Sample Mean Tests for High-Dimensional Compositional Data with Application to Microbiome Data Analysis

arXiv - MATH - Statistics Theory Pub Date : 2024-05-04 DOI: arxiv-2405.02551

Danning Li, Lingzhou Xue, Haoyi Yang, Xiufan Yu

Abstract: Testing differences in mean vectors is a fundamental task in the analysis of high-dimensional compositional data. Existing methods may suffer from low power if the underlying signal pattern is in a situation that does not favor the deployed test. In this work, we develop two-sample power-enhanced mean tests for high-dimensional compositional data based on the combination of p-values, which integrates strengths from two popular types of tests: the maximum-type test and the quadratic-type test. We provide rigorous theoretical guarantees on the proposed tests, showing accurate Type-I error rate control and enhanced testing power. Our method boosts the testing power towards a broader alternative space, which yields robust performance across a wide range of signal pattern settings. Our theory also contributes to the literature on power enhancement and Gaussian approximation for high-dimensional hypothesis testing. We demonstrate the performance of our method on both simulated data and real-world microbiome data, showing that our proposed approach improves the testing power substantially compared to existing methods.
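The idea of combining a maximum-type and a quadratic-type p-value can be sketched as follows. The abstract does not state which combination rule the authors use, so Fisher's method is shown purely as one classical option; it assumes the two p-values are (at least asymptotically) independent. The function name `fisher_combine_two` and the input values are hypothetical.

```python
import math


def fisher_combine_two(p_max, p_quad):
    """Fisher's combination of two independent p-values.

    Under the null, -2 * (ln p_max + ln p_quad) follows a chi-square
    distribution with 4 degrees of freedom, whose survival function has the
    closed form exp(-t / 2) * (1 + t / 2).
    """
    if not (0.0 < p_max <= 1.0 and 0.0 < p_quad <= 1.0):
        raise ValueError("p-values must lie in (0, 1]")
    stat = -2.0 * (math.log(p_max) + math.log(p_quad))
    return math.exp(-stat / 2.0) * (1.0 + stat / 2.0)


# Illustrative inputs only: a small max-type p-value and a weak quadratic-type one.
combined = fisher_combine_two(p_max=0.04, p_quad=0.20)
print(f"combined p-value: {combined:.4f}")
```

A combined test of this kind can reject when either component test carries strong evidence, which is the intuition behind gaining power across both sparse and dense signal patterns.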

Citations: 0

Grouping predictors via network-wide metrics

arXiv - MATH - Statistics Theory Pub Date : 2024-05-04 DOI: arxiv-2405.02715

Brandon Woosuk Park, Anand N. Vidyashankar, Tucker S. McElroy

Abstract: When multitudes of features can plausibly be associated with a response, both privacy considerations and model parsimony suggest grouping them to increase the predictive power of a regression model. Specifically, the identification of groups of predictors significantly associated with the response variable eases further downstream analysis and decision-making. This paper proposes a new data analysis methodology that utilizes the high-dimensional predictor space to construct an implicit network with weighted edges to identify significant associations between the response and the predictors. Using a population model for groups of predictors defined via network-wide metrics, a new supervised grouping algorithm is proposed to determine the correct group, with probability tending to one as the sample size diverges to infinity. For this reason, we establish several theoretical properties of the estimates of network-wide metrics. A novel model-assisted bootstrap procedure that substantially decreases computational complexity is developed, facilitating the assessment of uncertainty in the estimates of network-wide metrics. The proposed methods account for several challenges that arise in the high-dimensional data setting, including (i) a large number of predictors, (ii) uncertainty regarding the true statistical model, and (iii) model selection variability. The performance of the proposed methods is demonstrated through numerical experiments, data from sports analytics, and breast cancer data.

Citations: 0