Chris J Oates, Toni Karvonen, Aretha L Teckentrup, Marina Strocchi, Steven A Niederer
{"title":"Probabilistic Richardson extrapolation.","authors":"Chris J Oates, Toni Karvonen, Aretha L Teckentrup, Marina Strocchi, Steven A Niederer","doi":"10.1093/jrsssb/qkae098","DOIUrl":"https://doi.org/10.1093/jrsssb/qkae098","url":null,"abstract":"<p><p>For over a century, extrapolation methods have provided a powerful tool to improve the convergence order of a numerical method. However, these tools are not well-suited to modern computer codes, where multiple continua are discretized and convergence orders are not easily analysed. To address this challenge, we present a probabilistic perspective on Richardson extrapolation, a point of view that unifies classical extrapolation methods with modern multi-fidelity modelling, and handles uncertain convergence orders by allowing these to be statistically estimated. The approach is developed using Gaussian processes, leading to <i>Gauss-Richardson Extrapolation</i>. Conditions are established under which extrapolation using the conditional mean achieves a polynomial (or even an exponential) speed-up compared to the original numerical method. Further, the probabilistic formulation unlocks the possibility of experimental design, casting the selection of fidelities as a continuous optimization problem, which can then be (approximately) solved. A case study involving a computational cardiac model demonstrates that practical gains in accuracy can be achieved using the GRE method.</p>","PeriodicalId":49982,"journal":{"name":"Journal of the Royal Statistical Society Series B-Statistical Methodology","volume":"87 2","pages":"457-479"},"PeriodicalIF":3.1,"publicationDate":"2024-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11985099/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144058583","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A focusing framework for testing bi-directional causal effects in Mendelian randomization.","authors":"Sai Li, Ting Ye","doi":"10.1093/jrsssb/qkae101","DOIUrl":"https://doi.org/10.1093/jrsssb/qkae101","url":null,"abstract":"<p><p>Mendelian randomization (MR) is a powerful method that uses genetic variants as instrumental variables to infer the causal effect of a modifiable exposure on an outcome. We study inference for bi-directional causal relationships and causal directions with possibly pleiotropic genetic variants. We show that assumptions for common MR methods are often impossible or too stringent given the potential bi-directional relationships. We propose a new focusing framework for testing bi-directional causal effects and it can be coupled with many state-of-the-art MR methods. We provide theoretical guarantees for our proposal and demonstrate its performance using several simulated and real datasets.</p>","PeriodicalId":49982,"journal":{"name":"Journal of the Royal Statistical Society Series B-Statistical Methodology","volume":"87 2","pages":"529-548"},"PeriodicalIF":3.1,"publicationDate":"2024-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11985100/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144047748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Nonparametric estimation via partial derivatives.","authors":"Xiaowu Dai","doi":"10.1093/jrsssb/qkae093","DOIUrl":"https://doi.org/10.1093/jrsssb/qkae093","url":null,"abstract":"<p><p>Traditional nonparametric estimation methods often lead to a slow convergence rate in large dimensions and require unrealistically large dataset sizes for reliable conclusions. We develop an approach based on partial derivatives, either observed or estimated, to effectively estimate the function at near-parametric convergence rates. This novel approach and computational algorithm could lead to methods useful to practitioners in many areas of science and engineering. Our theoretical results reveal behaviour universal to this class of nonparametric estimation problems. We explore a general setting involving tensor product spaces and build upon the smoothing spline analysis of variance framework. For <i>d</i>-dimensional models under full interaction, the optimal rates with gradient information on <i>p</i> covariates are identical to those for the <math><mo>(</mo> <mi>d</mi> <mo>-</mo> <mi>p</mi> <mo>)</mo></math> -interaction models without gradients and, therefore, the models are immune to the <i>curse of interaction</i>. For additive models, the optimal rates using gradient information are <math><msqrt><mi>n</mi></msqrt> </math> , thus achieving the <i>parametric rate</i>. We demonstrate aspects of the theoretical results through synthetic and real data applications.</p>","PeriodicalId":49982,"journal":{"name":"Journal of the Royal Statistical Society Series B-Statistical Methodology","volume":"87 2","pages":"319-336"},"PeriodicalIF":3.1,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11985098/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144064677","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mode-wise principal subspace pursuit and matrix spiked covariance model.","authors":"Runshi Tang, Ming Yuan, Anru R Zhang","doi":"10.1093/jrsssb/qkae088","DOIUrl":"10.1093/jrsssb/qkae088","url":null,"abstract":"<p><p>This paper introduces a novel framework called Mode-wise Principal Subspace Pursuit (MOP-UP) to extract hidden variations in both the row and column dimensions for matrix data. To enhance the understanding of the framework, we introduce a class of matrix-variate spiked covariance models that serve as inspiration for the development of the MOP-UP algorithm. The MOP-UP algorithm consists of two steps: Average Subspace Capture (ASC) and Alternating Projection. These steps are specifically designed to capture the row-wise and column-wise dimension-reduced subspaces which contain the most informative features of the data. ASC utilizes a novel average projection operator as initialization and achieves exact recovery in the noiseless setting. We analyse the convergence and non-asymptotic error bounds of MOP-UP, introducing a blockwise matrix eigenvalue perturbation bound that proves the desired bound, where classic perturbation bounds fail. The effectiveness and practical merits of the proposed framework are demonstrated through experiments on both simulated and real datasets. Lastly, we discuss generalizations of our approach to higher-order data.</p>","PeriodicalId":49982,"journal":{"name":"Journal of the Royal Statistical Society Series B-Statistical Methodology","volume":"87 1","pages":"232-255"},"PeriodicalIF":3.1,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11809223/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143411335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extended fiducial inference: toward an automated process of statistical inference.","authors":"Faming Liang, Sehwan Kim, Yan Sun","doi":"10.1093/jrsssb/qkae082","DOIUrl":"10.1093/jrsssb/qkae082","url":null,"abstract":"<p><p>While fiducial inference was widely considered a big blunder by R.A. Fisher, the goal he initially set-'inferring the uncertainty of model parameters on the basis of observations'-has been continually pursued by many statisticians. To this end, we develop a new statistical inference method called extended Fiducial inference (EFI). The new method achieves the goal of fiducial inference by leveraging advanced statistical computing techniques while remaining scalable for big data. Extended Fiducial inference involves jointly imputing random errors realized in observations using stochastic gradient Markov chain Monte Carlo and estimating the inverse function using a sparse deep neural network (DNN). The consistency of the sparse DNN estimator ensures that the uncertainty embedded in observations is properly propagated to model parameters through the estimated inverse function, thereby validating downstream statistical inference. Compared to frequentist and Bayesian methods, EFI offers significant advantages in parameter estimation and hypothesis testing. Specifically, EFI provides higher fidelity in parameter estimation, especially when outliers are present in the observations; and eliminates the need for theoretical reference distributions in hypothesis testing, thereby automating the statistical inference process. Extended Fiducial inference also provides an innovative framework for semisupervised learning.</p>","PeriodicalId":49982,"journal":{"name":"Journal of the Royal Statistical Society Series B-Statistical Methodology","volume":"87 1","pages":"98-131"},"PeriodicalIF":3.1,"publicationDate":"2024-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11809222/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143400591","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Catch me if you can: signal localization with knockoff <i>e</i>-values.","authors":"Paula Gablenz, Chiara Sabatti","doi":"10.1093/jrsssb/qkae042","DOIUrl":"10.1093/jrsssb/qkae042","url":null,"abstract":"<p><p>We consider problems where many, somewhat redundant, hypotheses are tested and we are interested in reporting the most precise rejections, with false discovery rate (FDR) control. This is the case, for example, when researchers are interested both in individual hypotheses as well as group hypotheses corresponding to intersections of sets of the original hypotheses, at several resolution levels. A concrete application is in genome-wide association studies, where, depending on the signal strengths, it might be possible to resolve the influence of individual genetic variants on a phenotype with greater or lower precision. To adapt to the unknown signal strength, analyses are conducted at multiple resolutions and researchers are most interested in the more precise discoveries. Assuring FDR control on the reported findings with these adaptive searches is, however, often impossible. To design a multiple comparison procedure that allows for an adaptive choice of resolution with FDR control, we leverage <i>e</i>-values and linear programming. We adapt this approach to problems where knockoffs and group knockoffs have been successfully applied to test conditional independence hypotheses. We demonstrate its efficacy by analysing data from the UK Biobank.</p>","PeriodicalId":49982,"journal":{"name":"Journal of the Royal Statistical Society Series B-Statistical Methodology","volume":"87 1","pages":"56-73"},"PeriodicalIF":3.1,"publicationDate":"2024-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11809227/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143400584","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Model-assisted sensitivity analysis for treatment effects under unmeasured confounding via regularized calibrated estimation.","authors":"Zhiqiang Tan","doi":"10.1093/jrsssb/qkae034","DOIUrl":"10.1093/jrsssb/qkae034","url":null,"abstract":"<p><p>Consider sensitivity analysis for estimating average treatment effects under unmeasured confounding, assumed to satisfy a marginal sensitivity model. At the population level, we provide new representations for the sharp population bounds and doubly robust estimating functions. We also derive new, relaxed population bounds, depending on weighted linear outcome quantile regression. At the sample level, we develop new methods and theory for obtaining not only doubly robust point estimators for the relaxed population bounds with respect to misspecification of a propensity score model or an outcome mean regression model, but also model-assisted confidence intervals which are valid if the propensity score model is correctly specified, but the outcome quantile and mean regression models may be misspecified. The relaxed population bounds reduce to the sharp bounds if outcome quantile regression is correctly specified. For a linear outcome mean regression model, the confidence intervals are also doubly robust. Our methods involve regularized calibrated estimation, with Lasso penalties but carefully chosen loss functions, for fitting propensity score and outcome mean and quantile regression models. We present a simulation study and an empirical application to an observational study on the effects of right-heart catheterization. The proposed method is implemented in the R package RCALsa.</p>","PeriodicalId":49982,"journal":{"name":"Journal of the Royal Statistical Society Series B-Statistical Methodology","volume":"86 5","pages":"1339-1363"},"PeriodicalIF":3.1,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11558804/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142631137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Interpretable discriminant analysis for functional data supported on random nonlinear domains with an application to Alzheimer's disease.","authors":"Eardi Lila, Wenbo Zhang, Swati Rane Levendovszky","doi":"10.1093/jrsssb/qkae023","DOIUrl":"10.1093/jrsssb/qkae023","url":null,"abstract":"<p><p>We introduce a novel framework for the classification of functional data supported on nonlinear, and possibly random, manifold domains. The motivating application is the identification of subjects with Alzheimer's disease from their cortical surface geometry and associated cortical thickness map. The proposed model is based upon a reformulation of the classification problem as a regularized multivariate functional linear regression model. This allows us to adopt a direct approach to the estimation of the most discriminant direction while controlling for its complexity with appropriate differential regularization. Our approach does not require prior estimation of the covariance structure of the functional predictors, which is computationally prohibitive in our application setting. We provide a theoretical analysis of the out-of-sample prediction error of the proposed model and explore the finite sample performance in a simulation setting. We apply the proposed method to a pooled dataset from Alzheimer's Disease Neuroimaging Initiative and Parkinson's Progression Markers Initiative. Through this application, we identify discriminant directions that capture both cortical geometric and thickness predictive features of Alzheimer's disease that are consistent with the existing neuroscience literature.</p>","PeriodicalId":49982,"journal":{"name":"Journal of the Royal Statistical Society Series B-Statistical Methodology","volume":"86 4","pages":"1013-1044"},"PeriodicalIF":3.1,"publicationDate":"2024-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11398888/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142299609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ting Ye, Zhonghua Liu, Baoluo Sun, Eric Tchetgen Tchetgen
{"title":"GENIUS-MAWII: for robust Mendelian randomization with many weak invalid instruments.","authors":"Ting Ye, Zhonghua Liu, Baoluo Sun, Eric Tchetgen Tchetgen","doi":"10.1093/jrsssb/qkae024","DOIUrl":"10.1093/jrsssb/qkae024","url":null,"abstract":"<p><p>Mendelian randomization (MR) addresses causal questions using genetic variants as instrumental variables. We propose a new MR method, G-Estimation under No Interaction with Unmeasured Selection (GENIUS)-MAny Weak Invalid IV, which simultaneously addresses the 2 salient challenges in MR: many weak instruments and widespread horizontal pleiotropy. Similar to MR-GENIUS, we use heteroscedasticity of the exposure to identify the treatment effect. We derive influence functions of the treatment effect, and then we construct a continuous updating estimator and establish its asymptotic properties under a many weak invalid instruments asymptotic regime by developing novel semiparametric theory. We also provide a measure of weak identification, an overidentification test, and a graphical diagnostic tool.</p>","PeriodicalId":49982,"journal":{"name":"Journal of the Royal Statistical Society Series B-Statistical Methodology","volume":"86 4","pages":"1045-1067"},"PeriodicalIF":3.1,"publicationDate":"2024-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11398887/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142299608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yachong Yang, Arun Kumar Kuchibhotla, Eric Tchetgen Tchetgen
{"title":"Doubly robust calibration of prediction sets under covariate shift.","authors":"Yachong Yang, Arun Kumar Kuchibhotla, Eric Tchetgen Tchetgen","doi":"10.1093/jrsssb/qkae009","DOIUrl":"10.1093/jrsssb/qkae009","url":null,"abstract":"<p><p>Conformal prediction has received tremendous attention in recent years and has offered new solutions to problems in missing data and causal inference; yet these advances have not leveraged modern semi-parametric efficiency theory for more efficient uncertainty quantification. We consider the problem of obtaining well-calibrated prediction regions that can data adaptively account for a shift in the distribution of covariates between training and test data. Under a covariate shift assumption analogous to the standard missing at random assumption, we propose a general framework based on efficient influence functions to construct well-calibrated prediction regions for the unobserved outcome in the test sample without compromising coverage.</p>","PeriodicalId":49982,"journal":{"name":"Journal of the Royal Statistical Society Series B-Statistical Methodology","volume":"86 4","pages":"943-965"},"PeriodicalIF":3.1,"publicationDate":"2024-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11398884/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142299607","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}