LASSO regularization within the LocalGLMnet architecture
Authors: Ronald Richman, Mario V. Wüthrich
Journal: Advances in Data Analysis and Classification, volume 17, issue 4, pages 951–981
Publication date: 2022-12-13
Publication type: Journal Article
DOI: 10.1007/s11634-022-00529-z
URL: https://link.springer.com/article/10.1007/s11634-022-00529-z
Impact factor: 1.4 (JCR Q2, Statistics & Probability)
Abstract
Deep learning models have been very successful in the application of machine learning methods, often outperforming classical statistical models such as linear regression models or generalized linear models. On the other hand, deep learning models are often criticized for being neither explainable nor amenable to variable selection. There are two different ways of dealing with this problem: either we use post-hoc model interpretability methods, or we design specific deep learning architectures that allow for easier interpretation and explanation. This paper builds on our previous work on the LocalGLMnet architecture, which provides an interpretable deep learning architecture. In the present paper, we show how group LASSO regularization (and other regularization schemes) can be implemented within the LocalGLMnet architecture so that we obtain feature sparsity for variable selection. We benchmark our approach against the recently developed LassoNet of Lemhadri et al. (LassoNet: a neural network with feature sparsity. J Mach Learn Res 22:1–29, 2021).
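To illustrate the idea of feature sparsity through group LASSO regularization mentioned in the abstract, the following is a minimal, generic sketch in plain Python. It is not the authors' implementation and does not reproduce the LocalGLMnet architecture. It assumes that each input feature owns one group of weights, and it uses the standard group soft-thresholding (proximal) step, which sets a whole group to zero once its Euclidean norm falls below a threshold. The function names `group_lasso_penalty` and `group_soft_threshold`, the example weights, and the threshold value are all hypothetical choices made for illustration.

```python
import math


def group_norm(group):
    """Euclidean norm of one weight group."""
    return math.sqrt(sum(w * w for w in group))


def group_lasso_penalty(groups, lam):
    """Group LASSO penalty: lam times the sum of the group norms."""
    return lam * sum(group_norm(g) for g in groups)


def group_soft_threshold(groups, threshold):
    """Proximal step for the group LASSO penalty.

    A group whose norm is at most `threshold` is set to zero entirely;
    any other group is shrunk toward zero by the factor 1 - threshold / norm.
    """
    shrunk = []
    for g in groups:
        norm = group_norm(g)
        if norm <= threshold:
            shrunk.append([0.0] * len(g))
        else:
            scale = 1.0 - threshold / norm
            shrunk.append([scale * w for w in g])
    return shrunk


# Hypothetical weights: one group (list) per input feature.
weights = [[3.0, 4.0], [0.3, 0.4], [0.0, 0.0]]

penalty = group_lasso_penalty(weights, lam=1.0)
shrunk = group_soft_threshold(weights, threshold=1.0)

# A feature counts as selected if its weight group is not entirely zero.
selected = [group_norm(g) > 0.0 for g in shrunk]
print(penalty, shrunk, selected)
```

In this toy example only the first feature survives the thresholding step, because the other two groups have norms below the threshold. This whole-group zeroing is what makes a group penalty usable for variable selection, in contrast to a plain ridge penalty, which shrinks weights without setting them exactly to zero.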
About the journal:
The international journal Advances in Data Analysis and Classification (ADAC) is designed as a forum for high-standard publications on research and applications concerning the extraction of knowable aspects from many types of data. It publishes articles on such topics as structural, quantitative, or statistical approaches for the analysis of data; advances in classification, clustering, and pattern recognition methods; strategies for modeling complex data and mining large data sets; methods for the extraction of knowledge from data; and applications of advanced methods in specific domains of practice. Articles illustrate how new domain-specific knowledge can be made available from data by skillful use of data analysis methods. The journal also publishes survey papers that outline and illuminate the basic ideas and techniques of special approaches.