{"title":"A multivariate Poisson model based on a triangular comonotonic shock construction","authors":"Orla A. Murphy, Juliana Schulz","doi":"10.1002/cjs.70010","DOIUrl":"https://doi.org/10.1002/cjs.70010","url":null,"abstract":"<p>Multi-dimensional data frequently occur in many different fields, including risk management, insurance, biology, environmental sciences, and many more. In analyzing multivariate data, it is imperative that the underlying modelling assumptions adequately reflect both the marginal behaviour and the associations between components. This article focuses specifically on developing a new multivariate Poisson model appropriate for multi-dimensional count data. The proposed formulation is based on convolutions of comonotonic shock vectors with Poisson-distributed components and allows for flexibility in capturing different degrees of positive dependence. In this article, we will present the general model framework along with various distributional properties. Several estimation techniques will be explored and assessed both through simulations and in a real data application involving extreme rainfall events.</p>","PeriodicalId":55281,"journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique","volume":"53 3","pages":""},"PeriodicalIF":1.0,"publicationDate":"2025-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cjs.70010","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144918756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting rare events using training data from stratified sampling designs, with application to human-caused wildfire prediction","authors":"Johanna de Haan-Ward, Douglas G. Woolford, Simon J. Bonner","doi":"10.1002/cjs.70008","DOIUrl":"https://doi.org/10.1002/cjs.70008","url":null,"abstract":"<p>Response-based sampling is often used in modelling rare events from large, imbalanced data for efficiency. When modelling the event with logistic regression, the sampling design may be adjusted for using sampling weights or an offset. We propose a stratified sampling design for modelling rare events with large data which improves on previous methods by providing unbiased estimates of the standard errors of the coefficients in a multiple logistic regression scenario. We use multiple intercepts to model the incidence in the sampled data, then adjust each intercept via a stratum-specific offset. Our simulations provide no evidence of bias in the estimated logistic regression coefficients or their standard errors. We apply this method to spatio-temporal, fine-scale human-caused fire occurrence modelling for a region in northwestern Ontario, Canada, illustrating how the stratified sampling approach results in more locally precise estimates of fire occurrence.</p>","PeriodicalId":55281,"journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique","volume":"53 3","pages":""},"PeriodicalIF":1.0,"publicationDate":"2025-04-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cjs.70008","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144918755","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unified inference for longitudinal/functional data quantile dynamic additive models","authors":"Qian Huang, Tao Li, Jinhong You, Liwen Zhang","doi":"10.1002/cjs.70006","DOIUrl":"https://doi.org/10.1002/cjs.70006","url":null,"abstract":"<p>We investigate the unified inference of a time-varying additive model under the quantile regression framework, considering both sparse and dense longitudinal or functional data. For convolution-type smoothed objective functions, we propose a two-step method for estimating both the trend and the component functions. Theoretical analysis shows that the two-step estimators share the same asymptotic distribution as the oracle estimators, while the convergence rates and limiting variance functions differ between sparse and dense situations. However, making a subjective choice between these two cases can lead to incorrect statistical inferences. To address this issue, we develop sandwich formulas for variance estimations. This allows us to establish a unified inference without the need to decide whether the data are sparse or dense. Via simulation studies, we assess the finite-sample performance of the proposed methods. Finally, analyses of two different types of real data illustrate our proposed methods.</p>","PeriodicalId":55281,"journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique","volume":"53 3","pages":""},"PeriodicalIF":1.0,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144918754","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient and model-agnostic parameter estimation under privacy-preserving post-randomization data","authors":"Qinglong Tian, Jiwei Zhao","doi":"10.1002/cjs.70003","DOIUrl":"https://doi.org/10.1002/cjs.70003","url":null,"abstract":"<p>Balancing data privacy with public access is critical for sensitive datasets. However, even after de-identification, the data are still vulnerable to, for example, inference attacks (by matching some keywords with external datasets). Statistical disclosure control (SDC) methods offer additional protection, and the post-randomization method (PRAM) adds noise to data to achieve this goal. However, PRAM-perturbed data pose challenges for analysis, as directly using the perturbed data leads to biased parameter estimates. This article addresses parameter estimation when data are perturbed using PRAM for privacy. While existing methods suffer from limitations like being parameter-specific, model-dependent and lacking optimality guarantees, our proposed method overcomes these limitations. Our approach applies to general parameters defined through estimating equations and makes no assumptions about the underlying data model. Furthermore, we prove that the proposed estimator achieves the semiparametric efficiency bound, making it asymptotically optimal in terms of estimation efficiency.</p>","PeriodicalId":55281,"journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique","volume":"53 3","pages":""},"PeriodicalIF":1.0,"publicationDate":"2025-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cjs.70003","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144918688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Correction to “Matching distributions for survival data”","authors":"","doi":"10.1002/cjs.70007","DOIUrl":"https://doi.org/10.1002/cjs.70007","url":null,"abstract":"<p>Jiang, Q., Xia, Y., and Liang, B. (2022) Matching distributions for survival data. <i>The Canadian Journal of Statistics</i>, 50:751–775.</p><p>The name of the first author “Qiang JIANG” was incorrect. This should have been: “Qing JIANG”.</p><p>We apologize for this error.</p>","PeriodicalId":55281,"journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique","volume":"53 2","pages":""},"PeriodicalIF":0.8,"publicationDate":"2025-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cjs.70007","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144108866","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal relevant subset designs in nonlinear models","authors":"Adam Lane","doi":"10.1002/cjs.70004","DOIUrl":"https://doi.org/10.1002/cjs.70004","url":null,"abstract":"<p>It is well known that certain ancillary statistics form a relevant subset, a subset of the sample space on which inference should be restricted, and that conditioning on such ancillary statistics reduces the dimension of the data without a loss of information. The use of ancillary statistics in post-data inference has received significant attention; however, their role in the design of experiments has not been well characterized. Ancillary statistics are not known prior to data collection and as a result cannot be incorporated into the design a priori. Conversely, in sequential experiments the ancillary statistics based on the data from the preceding observations are known and can be used to determine the design assignment of the current observation. The main results of this work describe the benefits of incorporating ancillary statistics, specifically the ancillary statistic that constitutes a relevant subset, into adaptive designs.</p>","PeriodicalId":55281,"journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique","volume":"53 3","pages":""},"PeriodicalIF":1.0,"publicationDate":"2025-04-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144918718","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reweighted penalized regression for convenience samples","authors":"Zhuoran Zhang, Olivia Bernstein Morgan, Daniel L. Gillen, for the Alzheimer's Disease Neuroimaging Initiative","doi":"10.1002/cjs.70005","DOIUrl":"https://doi.org/10.1002/cjs.70005","url":null,"abstract":"<p>Modern epidemiological studies are often characterized by extensive data collection, which facilitates building high-dimensional predictive models. With large samples often conveniently sampled, weighted penalized regression models are commonly applied to provide improved prediction. In this article, we empirically show that weighted ridge regression models may yield suboptimal results because of the lack of flexibility in the penalty structure. We propose a generalized weighted ridge regression (GWRR) estimation procedure that allows for the adjustment of sampling weights in the penalty structure. We derive the asymptotic properties of the proposed GWRR estimator and provide a computationally efficient closed-form solution. We demonstrate the performance of the proposed GWRR estimator and justify the asymptotic variance via simulation studies. Finally, we illustrate the utility of our proposed estimator through an application to the prediction of mini-mental state examination (MMSE) scores.</p>","PeriodicalId":55281,"journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique","volume":"53 3","pages":""},"PeriodicalIF":1.0,"publicationDate":"2025-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144918716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Noisy matrix completion for longitudinal data with subject- and time-specific covariates","authors":"Zhaohan Sun, Yeying Zhu, Joel A. Dubin","doi":"10.1002/cjs.70002","DOIUrl":"https://doi.org/10.1002/cjs.70002","url":null,"abstract":"<p>In this article, we consider the imputation of missing responses in a longitudinal dataset via matrix completion. We propose a fixed-effect, longitudinal, low-rank model that incorporates both subject-specific and time-specific covariates. To solve the optimization problem, a two-step optimization algorithm is proposed, which provides good statistical properties for the estimation of the fixed effects and the low-rank term. In a theoretical investigation, the non-asymptotic error bounds on the fixed effects and low-rank term are presented. We illustrate the finite-sample performance of the proposed algorithm via simulation studies, and apply our method to a power plant SO<sub>2</sub> emissions dataset in which the monthly recorded amounts of emissions data on monitors are subject to missingness.</p>","PeriodicalId":55281,"journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique","volume":"53 3","pages":""},"PeriodicalIF":1.0,"publicationDate":"2025-03-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cjs.70002","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144918752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sample empirical likelihood methods for causal inference","authors":"Jingyue Huang, Changbao Wu, Leilei Zeng","doi":"10.1002/cjs.70000","DOIUrl":"https://doi.org/10.1002/cjs.70000","url":null,"abstract":"<p>Causal inference plays a crucial role in understanding the true impact of interventions, medical treatments, policies, or actions, enabling informed decision making and providing insights into the underlying mechanisms that shape our world. In this article, we establish a framework for the estimation of and inference concerning average treatment effects using a two-sample empirical likelihood function. Two different approaches to incorporating propensity scores are developed. The first approach introduces propensity-score-calibrated constraints in addition to the standard model-calibration constraints; the second approach uses the propensity scores to form weighted versions of the model-calibration constraints. The resulting estimators from both approaches are doubly robust. The limiting distributions of the two-sample empirical likelihood ratio statistics are derived, facilitating the construction of confidence intervals and hypothesis tests for the average treatment effect. Bootstrap methods for constructing sample empirical likelihood ratio confidence intervals are also discussed for both approaches. The finite-sample performance of each method is investigated via simulation studies.</p>","PeriodicalId":55281,"journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique","volume":"53 3","pages":""},"PeriodicalIF":1.0,"publicationDate":"2025-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144918687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Doubly robust criterion for causal inference","authors":"Takamichi Baba, Yoshiyuki Ninomiya","doi":"10.1002/cjs.70001","DOIUrl":"https://doi.org/10.1002/cjs.70001","url":null,"abstract":"<p>In causal inference, semiparametric estimation using propensity scores has rapidly developed in various directions. At the same time, although model selection is indispensable in statistical analysis, an information criterion for selecting the regression structure between the potential outcome and explanatory variables has not been well developed. Here, based on the original definition of AIC, we derive an AIC-type criterion for propensity score analysis. A risk based on the Kullback–Leibler divergence is defined as the cornerstone, and general causal inference models and general causal effects are treated. Considering the high importance of doubly robust estimation, we make the information criterion itself doubly robust so that it is an asymptotically unbiased estimator of the risk even under some model misspecification. In simulation studies, we compare the derived criterion with an existing weighted quasi-likelihood information criterion and confirm that the former outperforms the latter. Real data analyses indicate that results using the two criteria can differ significantly.</p>","PeriodicalId":55281,"journal":{"name":"Canadian Journal of Statistics-Revue Canadienne De Statistique","volume":"53 3","pages":""},"PeriodicalIF":1.0,"publicationDate":"2025-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/cjs.70001","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144918719","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}