Biometrika Pub Date: 2023-09-14 DOI: 10.1093/biomet/asad056
Wei Li, Zitong Lu, Jinzhu Jia, Min Xie, Zhi Geng
{"title":"Retrospective causal inference with multiple effect variables","authors":"Wei Li, Zitong Lu, Jinzhu Jia, Min Xie, Zhi Geng","doi":"10.1093/biomet/asad056","DOIUrl":"https://doi.org/10.1093/biomet/asad056","url":null,"abstract":"As highlighted in Dawid (2000) and Pearl & Mackenzie (2018), deducing the causes of given effects is a more challenging problem than evaluating the effects of causes in causal inference. Lu et al. (2023) proposed an approach for deducing causes of a single effect variable based on posterior causal effects. In many applications, there are multiple effect variables, and thus they can be used simultaneously to more accurately deduce the causes. To retrospectively deduce causes from multiple effects, we propose multivariate posterior total, intervention and direct causal effects conditional on the observed evidence. We describe the assumptions of no-confounding and monotonicity, under which we prove identifiability of the multivariate posterior causal effects and provide their identification equations. The proposed approach can be applied for causal attributions, medical diagnosis, blame and responsibility in various studies with multiple effect or outcome variables. Two examples are used to illustrate the proposed approach.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135552839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Mathematics","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrika Pub Date: 2023-09-09 DOI: 10.1093/biomet/asad053
Alexander Aue, Prabir Burman
{"title":"Estimation of prediction error in time series","authors":"Alexander Aue, Prabir Burman","doi":"10.1093/biomet/asad053","DOIUrl":"https://doi.org/10.1093/biomet/asad053","url":null,"abstract":"The accurate estimation of prediction errors in time series is an important problem, which has immediate implications for the accuracy of prediction intervals as well as the quality of a number of widely used time series model selection criteria such as the Akaike information criterion. Except for simple cases, however, it is difficult or even impossible to obtain exact analytical expressions for one-step and multi-step predictions. This may be one of the reasons that, unlike in the independent case (see Efron, 2004), up to now there has been no fully established methodology for time series prediction error estimation. Starting from an approximation to the bias-variance decomposition of the squared prediction error, a method for accurate estimation of prediction errors in both univariate and multivariate stationary time series is developed in this article. In particular, several estimates are derived for a general class of predictors that includes most of the popular linear, nonlinear, parametric and nonparametric time series models used in practice, with causal invertible autoregressive moving average and nonparametric autoregressive processes discussed as lead examples. Simulations demonstrate that the proposed estimators perform quite well in finite samples. The estimates may also be used for model selection when the purpose of modelling is prediction.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136108242","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Mathematics","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrika Pub Date: 2023-09-01 DOI: 10.1093/biomet/asad050
N. W. Koning, J. Hemerik
{"title":"More Efficient Exact Group Invariance Testing: using a Representative Subgroup","authors":"N. W. Koning, J. Hemerik","doi":"10.1093/biomet/asad050","DOIUrl":"https://doi.org/10.1093/biomet/asad050","url":null,"abstract":"We consider testing invariance of a distribution under an algebraic group of transformations, such as permutations or sign-flips. As such groups are typically huge, tests based on the full group are often computationally infeasible. Hence, it is standard practice to use a random subset of transformations. We improve upon this by replacing the random subset with a strategically chosen, fixed subgroup of transformations. In a generalized location model, we show that the resulting tests are often consistent for lower signal-to-noise ratios. Moreover, we establish an analogy between the power improvement and switching from a t-test to a Z-test under normality. Importantly, in permutation-based multiple testing, the efficiency gain with our approach can be huge, since we attain the same power with far fewer permutations.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":" ","pages":""},"PeriodicalIF":2.7,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48678941","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Mathematics","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrika Pub Date: 2023-09-01 Epub Date: 2023-01-19 DOI: 10.1093/biomet/asad004
Tommaso Rigon, Amy H Herring, David B Dunson
{"title":"A generalized Bayes framework for probabilistic clustering.","authors":"Tommaso Rigon, Amy H Herring, David B Dunson","doi":"10.1093/biomet/asad004","DOIUrl":"10.1093/biomet/asad004","url":null,"abstract":"Loss-based clustering methods, such as k-means clustering and its variants, are standard tools for finding groups in data. However, the lack of quantification of uncertainty in the estimated clusters is a disadvantage. Model-based clustering based on mixture models provides an alternative approach, but such methods face computational problems and are highly sensitive to the choice of kernel. In this article we propose a generalized Bayes framework that bridges between these paradigms through the use of Gibbs posteriors. In conducting Bayesian updating, the loglikelihood is replaced by a loss function for clustering, leading to a rich family of clustering methods. The Gibbs posterior represents a coherent updating of Bayesian beliefs without needing to specify a likelihood for the data, and can be used for characterizing uncertainty in clustering. We consider losses based on Bregman divergence and pairwise similarities, and develop efficient deterministic algorithms for point estimation along with sampling algorithms for uncertainty quantification. Several existing clustering algorithms, including k-means, can be interpreted as generalized Bayes estimators in our framework, and thus we provide a method of uncertainty quantification for these approaches, allowing, for example, calculation of the probability that a data point is well clustered.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":" ","pages":"559-578"},"PeriodicalIF":2.4,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11840691/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46381325","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Mathematics","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrika Pub Date: 2023-09-01 Epub Date: 2022-11-02 DOI: 10.1093/biomet/asac059
Yangjianchen Xu, Donglin Zeng, D Y Lin
{"title":"Marginal proportional hazards models for multivariate interval-censored data.","authors":"Yangjianchen Xu, Donglin Zeng, D Y Lin","doi":"10.1093/biomet/asac059","DOIUrl":"10.1093/biomet/asac059","url":null,"abstract":"Multivariate interval-censored data arise when there are multiple types of events or clusters of study subjects, such that the event times are potentially correlated and when each event is only known to occur over a particular time interval. We formulate the effects of potentially time-varying covariates on the multivariate event times through marginal proportional hazards models while leaving the dependence structures of the related event times unspecified. We construct the nonparametric pseudolikelihood under the working assumption that all event times are independent, and we provide a simple and stable EM-type algorithm. The resulting nonparametric maximum pseudolikelihood estimators for the regression parameters are shown to be consistent and asymptotically normal, with a limiting covariance matrix that can be consistently estimated by a sandwich estimator under arbitrary dependence structures for the related event times. We evaluate the performance of the proposed methods through extensive simulation studies and present an application to data from the Atherosclerosis Risk in Communities Study.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":"110 3","pages":"815-830"},"PeriodicalIF":2.4,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10434824/pdf/nihms-1874830.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10490393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Mathematics","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrika Pub Date: 2023-09-01 DOI: 10.1093/biomet/asac065
Jieru Shi, Zhenke Wu, Walter Dempsey
{"title":"Assessing time-varying causal effect moderation in the presence of cluster-level treatment effect heterogeneity and interference.","authors":"Jieru Shi, Zhenke Wu, Walter Dempsey","doi":"10.1093/biomet/asac065","DOIUrl":"https://doi.org/10.1093/biomet/asac065","url":null,"abstract":"The micro-randomized trial (MRT) is a sequential randomized experimental design to empirically evaluate the effectiveness of mobile health (mHealth) intervention components that may be delivered at hundreds or thousands of decision points. MRTs have motivated a new class of causal estimands, termed \"causal excursion effects\", for which semiparametric inference can be conducted via a weighted, centered least squares criterion (Boruvka et al., 2018). Existing methods assume between-subject independence and non-interference. Deviations from these assumptions often occur. In this paper, causal excursion effects are revisited under potential cluster-level treatment effect heterogeneity and interference, where the treatment effect of interest may depend on cluster-level moderators. Utility of the proposed methods is shown by analyzing data from a multi-institution cohort of first-year medical residents in the United States.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":"110 3","pages":"645-662"},"PeriodicalIF":2.7,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10501736/pdf/nihms-1882489.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10653942","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Mathematics","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrika Pub Date: 2023-08-31 DOI: 10.1093/biomet/asad049
Long Feng, Guang Yang
{"title":"Deep Kronecker Network","authors":"Long Feng, Guang Yang","doi":"10.1093/biomet/asad049","DOIUrl":"https://doi.org/10.1093/biomet/asad049","url":null,"abstract":"We develop a novel framework named Deep Kronecker Network for the analysis of medical imaging data, including magnetic resonance imaging (MRI), functional MRI, computed tomography, and more. Medical imaging data differs from general images in two main aspects: i) the sample size is often considerably smaller, and ii) the interpretation of the model is usually more crucial than predicting the outcome. As a result, standard methods such as convolutional neural networks cannot be directly applied to medical imaging analysis. Therefore, we propose the Deep Kronecker Network, which can adapt to the low sample size constraint and offer the desired model interpretation. Our approach is versatile, as it works for both matrix- and tensor-represented image data and can be applied to discrete and continuous outcomes. The Deep Kronecker Network is built upon a Kronecker product structure, which implicitly enforces a piecewise smooth property on coefficients. Moreover, our approach resembles a fully convolutional network as the Kronecker structure can be expressed in a convolutional form. Interestingly, our approach also has strong connections to the tensor regression framework proposed by Zhou et al. (2013), which imposes a canonical low-rank structure on tensor coefficients. We conduct both classification and regression analyses using real MRI data from the Alzheimer’s Disease Neuroimaging Initiative to demonstrate the effectiveness of our approach.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135830829","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Mathematics","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrika Pub Date: 2023-08-07 DOI: 10.1093/biomet/asad048
Yicheng Li, Haobo Zhang, Qian Lin
{"title":"Kernel interpolation generalizes poorly","authors":"Yicheng Li, Haobo Zhang, Qian Lin","doi":"10.1093/biomet/asad048","DOIUrl":"https://doi.org/10.1093/biomet/asad048","url":null,"abstract":"One of the most interesting problems in the recent renaissance of the studies in kernel regression might be whether kernel interpolation can generalize well, since it may help us understand the ‘benign overfitting phenomenon’ reported in the literature on deep networks. In this paper, under mild conditions, we show that, for any ε>0, the generalization error of kernel interpolation is lower bounded by Ω(n^{-ε}). In other words, kernel interpolation generalizes poorly for a large class of kernels. As a direct corollary, we can show that overfitted wide neural networks defined on the sphere generalize poorly.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":"98 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135904639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Mathematics","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrika Pub Date: 2023-08-02 DOI: 10.1093/biomet/asad047
Haibing Zhao, Huijuan Zhou
{"title":"τ-censored weighted Benjamini-Hochberg procedures under independence","authors":"Haibing Zhao, Huijuan Zhou","doi":"10.1093/biomet/asad047","DOIUrl":"https://doi.org/10.1093/biomet/asad047","url":null,"abstract":"In the field of multiple hypothesis testing, auxiliary information can be leveraged to enhance the efficiency of test procedures. A common way to make use of auxiliary information is by weighting p-values. However, when the weights are learned from data, controlling the finite-sample false discovery rate becomes challenging, and most existing weighted procedures only guarantee false discovery rate control in an asymptotic limit. In a recent study conducted by Ignatiadis & Huber (2021), a novel τ-censored weighted Benjamini-Hochberg procedure was proposed to control the finite-sample false discovery rate. The authors employed the cross-weighting approach to learn weights for the p-values. This approach randomly splits the data into several folds and constructs a weight for each p-value P_i using the p-values outside the fold containing P_i. Cross-weighting does not exploit the p-value information inside the fold and only balances the weights within each fold, which may result in a loss of power. In this article, we introduce two methods for constructing data-driven weights for τ-censored weighted Benjamini-Hochberg procedures under independence. They provide new insight into masking p-values to prevent overfitting in multiple testing. The first method utilizes a leave-one-out technique, where all but one of the p-values are used to learn a weight for each p-value. This technique masks the information of a p-value in its weight by calculating the infimum of the weight with respect to the p-value. The second method uses partial information from each p-value to construct weights and utilizes the conditional distributions of the null p-values to establish false discovery rate control. Additionally, we propose two methods for estimating the null proportion and demonstrate how to integrate null-proportion adaptivity into the proposed weights to improve power.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":" ","pages":""},"PeriodicalIF":2.7,"publicationDate":"2023-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49253424","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Mathematics","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Biometrika Pub Date: 2023-07-27 DOI: 10.1093/biomet/asad046
Ruijian Han, Lan Luo, Yuanyuan Lin, Jian Huang
{"title":"Online Inference with Debiased Stochastic Gradient Descent","authors":"Ruijian Han, Lan Luo, Yuanyuan Lin, Jian Huang","doi":"10.1093/biomet/asad046","DOIUrl":"https://doi.org/10.1093/biomet/asad046","url":null,"abstract":"We propose a debiased stochastic gradient descent algorithm for online statistical inference with high-dimensional data. Our approach combines the debiasing technique developed in high-dimensional statistics with the stochastic gradient descent algorithm. It can be used for efficiently constructing confidence intervals in an online fashion. Our proposed algorithm has several appealing aspects: first, as a one-pass algorithm, it reduces the time complexity; in addition, each update step requires only the current data together with the previous estimate, which reduces the space complexity. We establish the asymptotic normality of the proposed estimator under mild conditions on the sparsity level of the parameter and the data distribution. We conduct numerical experiments to demonstrate that the proposed debiased stochastic gradient descent algorithm reaches the nominal coverage probability. Furthermore, we illustrate our method with a high-dimensional text dataset.","PeriodicalId":9001,"journal":{"name":"Biometrika","volume":" ","pages":""},"PeriodicalIF":2.7,"publicationDate":"2023-07-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44970146","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"Mathematics","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}