Inference for possibly misspecified generalized linear models with nonpolynomial-dimensional nuisance parameters

IF 2.4, Zone 2 (Mathematics), JCR Q2 (BIOLOGY)
Biometrika Pub Date : 2024-05-24 DOI:10.1093/biomet/asae024
Shaoxin Hong, Jiancheng Jiang, Xuejun Jiang, Haofeng Wang
Citations: 0

Abstract

Inference for possibly misspecified generalized linear models with nonpolynomial-dimensional nuisance parameters
It is routine practice in statistical modelling to first select variables and then make inference for the selected model, as in stepwise regression. Such inference is made upon the assumption that the selected model is true. However, without this assumption, one would not know the validity of the inference. Similar problems also exist in high-dimensional regression with regularization. To address these problems, we propose a dimension-reduced generalized likelihood ratio test for generalized linear models with nonpolynomial dimensionality, based on quasilikelihood estimation, which allows for misspecification of the conditional variance. The test has nearly oracle performance when using the correct amount of shrinkage and has robust performance against the choice of regularization parameter across a large range. We further develop an adaptive data-driven dimension-reduced generalized likelihood ratio test and prove that with probability going to one it is an oracle generalized likelihood ratio test. However, in ultrahigh-dimensional models the penalized estimation may produce spuriously important variables that deteriorate the performance of the test. To tackle this problem, we introduce a cross-fitted dimension-reduced generalized likelihood ratio test, which is not only free of spurious effects but also robust against the choice of regularization parameter. We establish limiting distributions of the proposed tests. Their advantages are highlighted via theoretical and empirical comparisons to some competitive tests. An application to breast cancer data illustrates the use of our proposed methodology.
Source journal: Biometrika (Biology)
CiteScore: 5.50
Self-citation rate: 3.70%
Articles published: 56
Review time: 6-12 weeks
Journal description: Biometrika is primarily a journal of statistics in which emphasis is placed on papers containing original theoretical contributions of direct or potential value in applications. From time to time, papers in bordering fields are also published.