Regularized Least Absolute Deviations Regression and an Efficient Algorithm for Parameter Tuning

Li Wang, Michael D. Gordon, Ji Zhu
{"title":"Regularized Least Absolute Deviations Regression and an Efficient Algorithm for Parameter Tuning","authors":"Li Wang, Michael D. Gordon, Ji Zhu","doi":"10.1109/ICDM.2006.134","DOIUrl":null,"url":null,"abstract":"Linear regression is one of the most important and widely used techniques for data analysis. However, sometimes people are not satisfied with it because of the following two limitations: 1) its results are sensitive to outliers, so when the error terms are not normally distributed, especially when they have heavy-tailed distributions, linear regression often works badly; 2) its estimated coefficients tend to have high variance, although their bias is low. To reduce the influence of outliers, robust regression models were developed. Least absolute deviation (LAD) regression is one of them. LAD minimizes the mean absolute errors, instead of mean squared errors, so its results are more robust. To address the second limitation, shrinkage methods were proposed, which add a penalty on the size of the coefficients. The LASSO is one of these methods and it uses the L1-norm penalty, which not only reduces the prediction error and the variance of estimated coefficients, but also provides an automatic feature selection function. In this paper, we propose the regularized least absolute deviation (RLAD) regression model, which combines the nice features of the LAD and the LASSO together. The RLAD is a regularization method, whose objective function has the form of \"loss + penalty.\" The \"loss\" is the sum of the absolute deviations and the \"penalty\" is the L1-norm of the coefficient vector. Furthermore, to facilitate parameter tuning, we develop an efficient algorithm which can solve the entire regularization path in one pass. Simulations with various settings are performed to demonstrate its performance. Finally, we apply the algorithm to solve the image reconstruction problem and find interesting results.","PeriodicalId":356443,"journal":{"name":"Sixth International Conference on Data Mining (ICDM'06)","volume":"35 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2006-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"95","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Sixth International Conference on Data Mining (ICDM'06)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDM.2006.134","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 95

Abstract

Linear regression is one of the most important and widely used techniques for data analysis. However, it suffers from two well-known limitations: 1) its results are sensitive to outliers, so when the error terms are not normally distributed, and especially when they have heavy-tailed distributions, linear regression often performs poorly; 2) its estimated coefficients tend to have high variance, although their bias is low. To reduce the influence of outliers, robust regression models were developed; least absolute deviation (LAD) regression is one of them. LAD minimizes the sum of absolute errors rather than the sum of squared errors, so its results are more robust to outliers. To address the second limitation, shrinkage methods were proposed, which add a penalty on the size of the coefficients. The LASSO is one such method; it uses an L1-norm penalty, which not only reduces the prediction error and the variance of the estimated coefficients, but also performs automatic feature selection. In this paper, we propose the regularized least absolute deviation (RLAD) regression model, which combines the strengths of LAD and the LASSO. RLAD is a regularization method whose objective function has the form "loss + penalty": the "loss" is the sum of the absolute deviations, and the "penalty" is the L1-norm of the coefficient vector. Furthermore, to facilitate parameter tuning, we develop an efficient algorithm that solves the entire regularization path in one pass. Simulations with various settings demonstrate its performance. Finally, we apply the algorithm to an image reconstruction problem and find interesting results.
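For concreteness, the "loss + penalty" objective described in the abstract can be written out as follows. The notation is assumed here (lambda as the regularization parameter, beta_0 as an intercept), since the paper's own equations are not reproduced on this page:

```latex
\min_{\beta_0,\,\beta}\;\sum_{i=1}^{n}\left| y_i - \beta_0 - x_i^{\top}\beta \right| \;+\; \lambda\sum_{j=1}^{p}\left|\beta_j\right|
```

Because both the loss and the penalty are piecewise linear, the problem for a fixed lambda can be posed as a linear program; this structure is also what makes the solution path piecewise linear in lambda and traceable in one pass. Below is a minimal Python sketch that solves a single point on the path with scipy.optimize.linprog. It illustrates the RLAD objective only; it is not the paper's path-following algorithm, and the function name rlad_fit is hypothetical:

```python
import numpy as np
from scipy.optimize import linprog

def rlad_fit(X, y, lam):
    """Fit RLAD for a single penalty value `lam` via linear programming.

    LP variables: [beta0, beta (p), u (n), t (p)], where u_i bounds the
    absolute residual |y_i - beta0 - x_i' beta| and t_j bounds |beta_j|.
    (Illustrative sketch, not the paper's path algorithm.)
    """
    n, p = X.shape

    # Objective: sum of u_i (absolute deviations) + lam * sum of t_j (L1 penalty).
    c = np.concatenate([np.zeros(1 + p), np.ones(n), lam * np.ones(p)])

    # Residual constraints:  +/-(y_i - beta0 - x_i' beta) <= u_i
    A1 = np.hstack([-np.ones((n, 1)), -X, -np.eye(n), np.zeros((n, p))])
    A2 = np.hstack([ np.ones((n, 1)),  X, -np.eye(n), np.zeros((n, p))])
    # Penalty constraints:  +/-beta_j <= t_j
    A3 = np.hstack([np.zeros((p, 1)),  np.eye(p), np.zeros((p, n)), -np.eye(p)])
    A4 = np.hstack([np.zeros((p, 1)), -np.eye(p), np.zeros((p, n)), -np.eye(p)])

    A_ub = np.vstack([A1, A2, A3, A4])
    b_ub = np.concatenate([-y, y, np.zeros(2 * p)])

    # beta0 and beta are free; the auxiliary u and t are nonnegative.
    bounds = [(None, None)] * (1 + p) + [(0, None)] * (n + p)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[0], res.x[1:1 + p]
```

As a quick usage example on synthetic data with heavy-tailed errors (the setting where LAD-type losses are expected to help):

```python
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
beta_true = np.array([3.0, 0.0, -2.0, 0.0, 0.0])
y = X @ beta_true + rng.standard_t(df=1.5, size=100)  # heavy-tailed noise

beta0_hat, beta_hat = rlad_fit(X, y, lam=5.0)
print(np.round(beta_hat, 2))  # coefficients of irrelevant features shrink toward 0
```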