Simulation-calibration testing for inference in Lasso regressions

Matthieu Pluntz, Cyril Dalmasso, Pascale Tubert-Bitter, Ismail Ahmed
arXiv - STAT - Statistics Theory, published 2024-09-03. DOI: arxiv-2409.02269 (https://doi.org/arxiv-2409.02269)

Abstract

We propose a test of the significance of a variable appearing on the Lasso path and use it in a procedure for selecting one of the models of the Lasso path, controlling the Family-Wise Error Rate. Our null hypothesis depends on a set A of already selected variables and states that it contains all the active variables. We focus on the regularization parameter value from which a first variable outside A is selected. As the test statistic, we use this quantity's conditional p-value, which we define conditional on the non-penalized estimated coefficients of the model restricted to A. We estimate this by simulating outcome vectors and then calibrating them on the observed outcome's estimated coefficients. We adapt the calibration heuristically to the case of generalized linear models in which it turns into an iterative stochastic procedure. We prove that the test controls the risk of selecting a false positive in linear models, both under the null hypothesis and, under a correlation condition, when A does not contain all active variables. We assess the performance of our procedure through extensive simulation studies. We also illustrate it in the detection of exposures associated with drug-induced liver injuries in the French pharmacovigilance database.
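
The simulate-then-calibrate estimate of the conditional p-value can be illustrated with a minimal sketch for the Gaussian linear model. It is not the authors' implementation and rests on several assumptions that are not in the abstract: the variables in A are left unpenalized, so that the regularization value at which a first variable outside A enters has the closed form max over j outside A of |X_j' r| / n, with r the residual of the least-squares fit on A; the noise standard deviation is plugged in from that residual; and all function names (`projector`, `entry_lambda`, `calibrate`, `calibrated_p_value`) are hypothetical.

```python
import numpy as np


def projector(X_A):
    """Orthogonal projector onto the column space of X_A (A must be non-empty)."""
    Q, _ = np.linalg.qr(X_A)
    return Q @ Q.T


def entry_lambda(X, y, A, P_A):
    """Penalty value at which a first variable outside A becomes active.

    Assumption: the coefficients of A are unpenalized, which gives the
    closed form max_j |X_j' r| / n with r the residual of the fit on A.
    """
    n, p = X.shape
    outside = [j for j in range(p) if j not in A]
    resid = y - P_A @ y
    return np.max(np.abs(X[:, outside].T @ resid)) / n


def calibrate(y_sim, fitted_obs, P_A):
    """Shift y_sim so that its least-squares fit on A equals the observed fit."""
    return y_sim - P_A @ y_sim + fitted_obs


def calibrated_p_value(X, y, A, n_sim=500, seed=0):
    """Monte Carlo estimate of the conditional p-value of the entry value."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    P_A = projector(X[:, A])
    fitted = P_A @ y
    # Plug-in noise level from the residual of the model restricted to A.
    sigma_hat = np.sqrt(np.sum((y - fitted) ** 2) / (n - len(A)))
    lam_obs = entry_lambda(X, y, A, P_A)
    exceed = 0
    for _ in range(n_sim):
        # Simulate an outcome under the null: only variables in A are active.
        y_sim = fitted + sigma_hat * rng.standard_normal(n)
        # Calibrate it on the observed estimated coefficients.
        y_cal = calibrate(y_sim, fitted, P_A)
        exceed += entry_lambda(X, y_cal, A, P_A) >= lam_obs
    # The +1 terms keep the estimate strictly positive.
    return (1 + exceed) / (1 + n_sim)


# Illustrative data: variables 0 and 1 are active, A = [0, 1] satisfies the null.
rng = np.random.default_rng(42)
X = rng.standard_normal((50, 6))
y = 2 * X[:, 0] - X[:, 1] + rng.standard_normal(50)
print(calibrated_p_value(X, y, A=[0, 1]))
```

In this simplified variant the statistic depends on the outcome only through its residual, so the calibration step does not change its value. It is kept to show where calibration would act when the statistic is computed on the standard Lasso path, where the coefficients of A are penalized too.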