Practical Considerations for Variable Screening in the Super Learner.

Brian D Williamson, Drew King, Ying Huang
Journal: The New England Journal of Statistics in Data Science
Published: 2025-05-07
DOI: 10.51387/25-nejsds82 (https://doi.org/10.51387/25-nejsds82)
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12462829/pdf/

Abstract

Estimating a prediction function is a fundamental component of many data analyses. The super learner ensemble, a particular implementation of stacking, has desirable theoretical properties and has been used successfully in many applications. Dimension reduction can be accomplished by using variable screening algorithms (screeners), including the lasso, within the ensemble prior to fitting other prediction algorithms. However, the performance of a super learner using the lasso for dimension reduction has not been fully explored in cases where the lasso is known to perform poorly. We provide empirical results that suggest that a diverse set of candidate screeners should be used to protect against poor performance of any one screener, similar to the guidance for choosing a library of prediction algorithms for the super learner. These results are further illustrated through the analysis of HIV-1 antibody data.
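As a rough illustration of the idea in the abstract, the sketch below pairs each candidate screener with each candidate learner and compares the pairs by cross-validated risk. It is not the authors' code. It is a simplified "discrete" super learner that selects the single best screener-learner pair rather than fitting a weighted ensemble, and a marginal-correlation screener stands in for the lasso. All names (`screen_all`, `screen_top_corr`, `fit_mean`, `fit_knn`, `cv_risk`, `discrete_super_learner`) and the simulated data are assumptions made for illustration.

```python
import random

def pearson(a, b):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    saa = sum((x - ma) ** 2 for x in a)
    sbb = sum((y - mb) ** 2 for y in b)
    return sab / (saa * sbb) ** 0.5 if saa > 0 and sbb > 0 else 0.0

def screen_all(X, y):
    """Screener that keeps every column (no dimension reduction)."""
    return list(range(len(X[0])))

def screen_top_corr(X, y, k=2):
    """Screener that keeps the k columns with the largest |correlation| with y."""
    cols = range(len(X[0]))
    ranked = sorted(cols, key=lambda j: -abs(pearson([r[j] for r in X], y)))
    return sorted(ranked[:k])

def fit_mean(X, y):
    """Learner that ignores X and predicts the training mean of y."""
    m = sum(y) / len(y)
    return lambda row: m

def fit_knn(X, y, k=5):
    """k-nearest-neighbour regression using squared Euclidean distance."""
    def predict(row):
        d = sorted((sum((a - b) ** 2 for a, b in zip(r, row)), yi)
                   for r, yi in zip(X, y))
        return sum(yi for _, yi in d[:k]) / k
    return predict

def cv_risk(X, y, screener, learner, n_folds=5):
    """Cross-validated MSE; the screener is re-run inside every training fold."""
    n = len(y)
    sq_err = 0.0
    for f in range(n_folds):
        test = [i for i in range(n) if i % n_folds == f]
        train = [i for i in range(n) if i % n_folds != f]
        Xtr, ytr = [X[i] for i in train], [y[i] for i in train]
        keep = screener(Xtr, ytr)  # screening sees training data only
        predict = learner([[r[j] for j in keep] for r in Xtr], ytr)
        sq_err += sum((y[i] - predict([X[i][j] for j in keep])) ** 2 for i in test)
    return sq_err / n

def discrete_super_learner(X, y, screeners, learners):
    """Return the CV risk of every screener-learner pair and the best pair's name."""
    risks = {(s, l): cv_risk(X, y, screeners[s], learners[l])
             for s in screeners for l in learners}
    return risks, min(risks, key=risks.get)

# Simulated data (assumption): 10 features, only the first two affect y.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(10)] for _ in range(200)]
y = [2 * r[0] - 2 * r[1] + rng.gauss(0, 0.5) for r in X]

screeners = {"all": screen_all, "top2_corr": screen_top_corr}
learners = {"mean": fit_mean, "knn": fit_knn}
risks, best = discrete_super_learner(X, y, screeners, learners)
for pair, risk in sorted(risks.items(), key=lambda kv: kv[1]):
    print(pair, round(risk, 3))
print("selected:", best)
```

Including the "keep everything" screener next to the correlation screener mirrors the abstract's recommendation: if one screener performs poorly on a given data set, cross-validation can fall back on another. Note that the screener is applied inside each training fold, so the held-out fold never influences which variables are kept.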
