The Canadian Journal of Statistics: Latest Articles

Distributed learning for kernel mode–based regression
The Canadian Journal of Statistics Pub Date : 2024-09-03 DOI: 10.1002/cjs.11831
Tao Wang
Abstract: We propose a parametric kernel mode–based regression built on the mode value, which provides robust and efficient estimators for datasets containing outliers or heavy‐tailed distributions. To address the challenges posed by massive datasets, we integrate this regression method with distributed statistical learning techniques, which greatly reduces the required amount of primary memory and simultaneously accommodates heterogeneity in the estimation process. By approximating the local kernel objective function with a least squares format, we are able to preserve compact statistics for each worker machine, facilitating the reconstruction of estimates for the entire dataset with minimal asymptotic approximation error. Additionally, we explore shrinkage estimation through local quadratic approximation, showcasing that the resulting estimator possesses the oracle property through an adaptive LASSO approach. The finite‐sample performance of the developed method is illustrated using simulations and real data analysis.
Citations: 0
Efficient semiparametric estimation in two‐sample comparison via semisupervised learning
The Canadian Journal of Statistics Pub Date : 2024-09-03 DOI: 10.1002/cjs.11813
Tao Tan, Shuyi Zhang, Yong Zhou
Abstract: We develop a general semisupervised framework for statistical inference in the two‐sample comparison setting. Although the supervised Mann–Whitney statistic outperforms many estimators in the two‐sample problem for nonnormally distributed responses, it is excessively inefficient because it ignores large amounts of unlabelled information. To borrow strength from unlabelled data, we propose a class of efficient and adaptive estimators that use two‐step semiparametric imputation. The probabilistic index model is adopted primarily to achieve dimension reduction for multivariate covariates, and a follow‐up reweighting step balances the contributions of labelled and unlabelled data. The asymptotic properties of our estimator are derived with variance comparison through a phase diagram. Efficiency theory shows our estimators achieve the semiparametric variance lower bound if the probabilistic index model is correctly specified, and are more efficient than their supervised counterpart when the model is not degenerate. The asymptotic variance is estimated through a two‐step perturbation resampling procedure. To gauge the finite sample performance, we conducted extensive simulation studies which verify the adaptive nature of our methods with respect to model misspecification. To illustrate the merits of our proposed method, we analyze a dataset concerning homelessness in Los Angeles.
Citations: 0
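Illustrative note (not from the article): the supervised Mann–Whitney statistic mentioned in the abstract above estimates the probabilistic index P(X < Y) + 0.5 P(X = Y) from two labelled samples. The Python sketch below shows only that baseline quantity. The function name `probabilistic_index` and the sample values are hypothetical, and the paper's semisupervised imputation and reweighting steps are not reproduced here.

```python
from itertools import product


def probabilistic_index(x, y):
    """Estimate P(X < Y) + 0.5 * P(X = Y) by comparing every pair (x_i, y_j)."""
    if not x or not y:
        raise ValueError("both samples must be non-empty")
    total = 0.0
    for xi, yj in product(x, y):
        if xi < yj:
            total += 1.0      # pair counts fully when x_i is smaller
        elif xi == yj:
            total += 0.5      # ties count as one half
    return total / (len(x) * len(y))


if __name__ == "__main__":
    # Hypothetical labelled responses for two groups.
    control = [1.2, 2.4, 3.1, 0.7]
    treated = [2.9, 3.8, 1.5, 4.2]
    print(probabilistic_index(control, treated))
```

A value near 0.5 suggests no systematic ordering between the groups, while values near 0 or 1 suggest that one group tends to have larger responses.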
A new copula regression model for hierarchical data
The Canadian Journal of Statistics Pub Date : 2024-08-30 DOI: 10.1002/cjs.11830
Talagbe Gabin Akpo, Louis‐Paul Rivest
Abstract: This article proposes multivariate copula models for hierarchical data. They account for two types of correlation: one is between variables measured on the same unit, and the other is a correlation between units in the same cluster. This model is used to carry out copula regression for hierarchical data that gives cluster‐specific prediction curves. In the simple case where a cluster contains two units and where two variables are measured on each one, the new model is constructed with a ‐vine. The proposed copula density is expressed in terms of three copula families. When the copula families and the marginal distributions are normal, the model is equivalent to a normal linear mixed model with random cluster‐specific intercepts. Methods to select the three copula families and to estimate their parameters are proposed. We perform Monte Carlo studies of the sampling properties of these estimators and of out‐of‐sample predictions. The new model is applied to a dataset on the marks of students in several schools.
Citations: 0
Fast and scalable inference for spatial extreme value models
The Canadian Journal of Statistics Pub Date : 2024-08-22 DOI: 10.1002/cjs.11829
Meixi Chen, Reza Ramezan, Martin Lysy
Abstract: The generalized extreme value (GEV) distribution is a popular model for analyzing and forecasting extreme weather data. To increase prediction accuracy, spatial information is often pooled via a latent Gaussian process (GP) on the GEV parameters. Inference for GEV‐GP models is typically carried out using Markov Chain Monte Carlo (MCMC) methods, or using approximate inference methods such as the integrated nested Laplace approximation (INLA). However, MCMC becomes prohibitively slow as the number of spatial locations increases, whereas INLA is applicable in practice only to a limited subset of GEV‐GP models. In this article, we revisit the original Laplace approximation for fitting spatial GEV models. In combination with a popular sparsity‐inducing spatial covariance approximation technique, we show through simulations that our approach accurately estimates the Bayesian predictive distribution of extreme weather events, is scalable to several thousand spatial locations, and is several orders of magnitude faster than MCMC. A case study in forecasting extreme snowfall across Canada is presented.
Citations: 0
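Illustrative note (not from the article): the GEV distribution named in the abstract above has the standard CDF F(x) = exp{-[1 + ξ(x - μ)/σ]^(-1/ξ)} on its support, with the Gumbel limit exp{-exp(-(x - μ)/σ)} at ξ = 0. The Python sketch below evaluates that CDF and its inverse (a return level). The function names `gev_cdf` and `gev_quantile` and the parameter values are assumptions for illustration. The latent Gaussian process and the Laplace approximation studied in the paper are not shown.

```python
import math


def gev_cdf(x, mu=0.0, sigma=1.0, xi=0.0):
    """CDF of the GEV distribution with location mu, scale sigma, shape xi."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    if xi == 0.0:
        return math.exp(-math.exp(-z))        # Gumbel limit
    t = 1.0 + xi * z
    if t <= 0.0:
        # Outside the support: below it when xi > 0, above it when xi < 0.
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def gev_quantile(p, mu=0.0, sigma=1.0, xi=0.0):
    """Inverse CDF; with p = 1 - 1/T this is the T-period return level."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    y = -math.log(p)
    if xi == 0.0:
        return mu - sigma * math.log(y)
    return mu + sigma * (y ** (-xi) - 1.0) / xi


if __name__ == "__main__":
    # Hypothetical parameters; 0.99 corresponds to a 100-period return level.
    print(gev_quantile(0.99, mu=10.0, sigma=2.0, xi=0.1))
```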
A framework for incorporating behavioural change into individual‐level spatial epidemic models
The Canadian Journal of Statistics Pub Date : 2024-08-22 DOI: 10.1002/cjs.11828
Madeline A. Ward, Rob Deardon, Lorna E. Deeth
Abstract: Epidemic trajectories can be substantially impacted by people modifying their behaviours in response to changes in their perceived risk of spreading or contracting the disease. However, most infectious disease models assume a stable population behaviour. We present a flexible new class of models, called behavioural change individual‐level models (BC‐ILMs), that incorporate both individual‐level covariate information and a data‐driven behavioural change effect. Focusing on spatial BC‐ILMs, we consider four “alarm” functions to model the effect of behavioural change as a function of infection prevalence over time. Through simulation studies, we find that if behavioural change is present, using an alarm function, even if specified incorrectly, will result in an improvement in posterior predictive performance over a model that assumes stable population behaviour. The methods are applied to data from the 2001 U.K. foot and mouth disease epidemic. The results show some evidence of a behavioural change effect, although it may not meaningfully impact model fit compared to a simpler spatial ILM in this dataset.
Citations: 0
Debiased lasso after sample splitting for estimation and inference in high‐dimensional generalized linear models
The Canadian Journal of Statistics Pub Date : 2024-08-22 DOI: 10.1002/cjs.11827
Omar Vazquez, Bin Nan
Abstract: We consider random sample splitting for estimation and inference in high‐dimensional generalized linear models (GLMs), where we first apply the lasso to select a submodel using one subsample and then apply the debiased lasso to fit the selected model using the remaining subsample. We show that a sample splitting procedure based on the debiased lasso yields asymptotically normal estimates under mild conditions and that multiple splitting can address the loss of efficiency. Our simulation results indicate that using the debiased lasso instead of the standard maximum likelihood method in the estimation stage can vastly reduce the bias and variance of the resulting estimates. Furthermore, our multiple splitting debiased lasso method has better numerical performance than some existing methods for high‐dimensional GLMs proposed in the recent literature. We illustrate the proposed multiple splitting method with an analysis of the smoking data of the Mid‐South Tobacco Case–Control Study.
Citations: 0
Variable selection in modelling clustered data via within‐cluster resampling
The Canadian Journal of Statistics Pub Date : 2024-08-01 DOI: 10.1002/cjs.11824
Shangyuan Ye, Tingting Yu, Daniel A. Caroff, Susan S. Huang, Bo Zhang, Rui Wang
Abstract: In many biomedical applications, there is a need to build risk‐adjustment models based on clustered data. However, methods for variable selection that are applicable to clustered discrete data settings with a large number of candidate variables and potentially large cluster sizes are lacking. We develop a new variable selection approach that combines within‐cluster resampling techniques with penalized likelihood methods to select variables for high‐dimensional clustered data. We derive an upper bound on the expected number of falsely selected variables, demonstrate the oracle properties of the proposed method and evaluate the finite sample performance of the method through extensive simulations. We illustrate the proposed approach using a colon surgical site infection data set consisting of 39,468 individuals from 149 hospitals to build risk‐adjustment models that account for both the main effects of various risk factors and their two‐way interactions.
Citations: 0
Joint analysis of longitudinal count and binary response data in the presence of outliers
The Canadian Journal of Statistics Pub Date : 2024-08-01 DOI: 10.1002/cjs.11819
Sanjoy Sinha
Abstract: In this article, we develop an innovative, robust method for jointly analyzing longitudinal count and binary responses. The method is useful for bounding the influence of potential outliers in the data when estimating the model parameters. We use a log‐linear model for the count response and a logistic regression model for the binary response, where the two response processes are linked through a set of association parameters. The asymptotic properties of the robust estimators are briefly studied. The empirical properties of the estimators are studied based on simulations. The study shows that the proposed estimators are approximately unbiased and also efficient when fitting a joint model to data contaminated with outliers. We also apply the proposed method to some real longitudinal survey data obtained from a health study.
Citations: 0
Robust change point detection for high‐dimensional linear models with tolerance for outliers and heavy tails
The Canadian Journal of Statistics Pub Date : 2024-08-01 DOI: 10.1002/cjs.11826
Zhi Yang, Liwen Zhang, Siyu Sun, Bin Liu
Abstract: This article focuses on detecting change points in high‐dimensional linear regression models with piecewise constant regression coefficients, moving beyond the conventional reliance on strict Gaussian or sub‐Gaussian noise assumptions. In the face of real‐world complexities, where noise often deviates into uncertain or heavy‐tailed distributions, we propose two tailored algorithms: a dynamic programming algorithm (DPA) for improved localization accuracy, and a binary segmentation algorithm (BSA) optimized for computational efficiency. These solutions are designed to be flexible, catering to increasing sample sizes and data dimensions, and offer a robust estimation of change points without requiring specific moments of the noise distribution. The efficacy of DPA and BSA is thoroughly evaluated through extensive simulation studies and application to real datasets, showing their competitive edge in adaptability and performance.
Citations: 0
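Illustrative note (not from the article): binary segmentation, the general idea behind the BSA named in the abstract above, repeatedly splits a sequence at the point where a contrast statistic is largest and stops when no split exceeds a threshold. The Python sketch below is a deliberately simplified version for a mean shift in a univariate sequence using a CUSUM contrast. It is not the authors' algorithm, which targets high-dimensional regression coefficients under heavy-tailed noise. The names `cusum_split` and `binary_segmentation`, the `threshold` tuning value, and the data are all hypothetical.

```python
import math


def cusum_split(values, start, end, min_size):
    """Return (index, statistic) of the best mean-shift split of values[start:end]."""
    n = end - start
    total = sum(values[start:end])
    best_index, best_stat = None, 0.0
    left_sum = 0.0
    for t in range(start + 1, end):
        left_sum += values[t - 1]
        n_left, n_right = t - start, end - t
        if n_left < min_size or n_right < min_size:
            continue  # keep both sides at least min_size long
        gap = left_sum / n_left - (total - left_sum) / n_right
        stat = math.sqrt(n_left * n_right / n) * abs(gap)
        if stat > best_stat:
            best_index, best_stat = t, stat
    return best_index, best_stat


def binary_segmentation(values, threshold, min_size=2):
    """Split segments while the CUSUM statistic exceeds the threshold."""
    change_points = []
    stack = [(0, len(values))]
    while stack:
        start, end = stack.pop()
        if end - start < 2 * min_size:
            continue
        index, stat = cusum_split(values, start, end, min_size)
        if index is None or stat <= threshold:
            continue
        change_points.append(index)
        stack.append((start, index))
        stack.append((index, end))
    return sorted(change_points)


if __name__ == "__main__":
    # Hypothetical noise-free sequence with mean shifts at positions 10 and 20.
    series = [0.0] * 10 + [5.0] * 10 + [0.0] * 10
    print(binary_segmentation(series, threshold=3.0))
```

Each reported index is the first position of a new segment. In practice the threshold would have to be calibrated to the noise level, which this sketch does not attempt.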
Bayesian jackknife empirical likelihood‐based inference for missing data and causal inference
The Canadian Journal of Statistics Pub Date : 2024-08-01 DOI: 10.1002/cjs.11825
Sixia Chen, Yuke Wang, Yichuan Zhao
Abstract: Missing data reduce the representativeness of the sample and can lead to inference problems. In this article, we apply the Bayesian jackknife empirical likelihood (BJEL) method for inference on data that are missing at random, as well as for causal inference. The semiparametric fractional imputation estimator, propensity score‐weighted estimator, and doubly robust estimator are used for constructing the jackknife pseudo values, which are needed for conducting BJEL‐based inference with missing data. Existing methods, such as normal approximation and JEL, are compared with the BJEL approach in a simulation study. The proposed approach shows better performance in many scenarios in terms of credible intervals. Furthermore, we demonstrate the application of the proposed approach for causal inference problems in a study of risk factors for impaired kidney function.
Citations: 0
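Illustrative note (not from the article): jackknife pseudo values, which the abstract above uses as the building block for BJEL, follow the standard definition V_i = n T(all data) - (n - 1) T(data without observation i) for an estimator T. The Python sketch below computes them for an arbitrary estimator passed as a function. The names `jackknife_pseudo_values` and `jackknife_estimate` and the data are hypothetical, and the imputation, propensity score‐weighted, and doubly robust estimators from the paper are not implemented; the sample mean stands in for them.

```python
import math


def mean(values):
    """Plain arithmetic mean, used here as a stand-in estimator."""
    return sum(values) / len(values)


def jackknife_pseudo_values(data, estimator):
    """Return V_i = n*T(data) - (n-1)*T(data without i) for each observation i."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    full = estimator(data)
    pseudo = []
    for i in range(n):
        leave_one_out = data[:i] + data[i + 1:]   # data must be a list
        pseudo.append(n * full - (n - 1) * estimator(leave_one_out))
    return pseudo


def jackknife_estimate(data, estimator):
    """Jackknife point estimate and standard error from the pseudo values."""
    pseudo = jackknife_pseudo_values(data, estimator)
    n = len(pseudo)
    centre = mean(pseudo)
    variance = sum((v - centre) ** 2 for v in pseudo) / (n - 1)
    return centre, math.sqrt(variance / n)


if __name__ == "__main__":
    observations = [1.0, 2.0, 3.0, 4.0]   # hypothetical data
    print(jackknife_pseudo_values(observations, mean))
    print(jackknife_estimate(observations, mean))
```

For the sample mean the pseudo values coincide with the observations themselves (up to floating-point rounding), which makes it a convenient sanity check before substituting a more complex estimator.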