Improving Prediction Accuracy of Lasso and Ridge Regression as an Alternative to LS Regression to Identify Variable Selection Problems

P. Omer
Published in: Proceeding of 3rd International Conference of Mathematics and its Applications
Publication date: 2020-10-28
DOI: 10.31972/ticma22.05 (https://doi.org/10.31972/ticma22.05)
Citations: 1

Abstract

This paper introduces Lasso and Ridge regression, two popular regularization methods that differ in how they penalize the coefficients: Lasso regression applies L1 regularization, while Ridge regression applies L2 regularization. Regression models serve two main purposes: explanation and prediction of scientific phenomena. Prediction accuracy is optimized by balancing the bias and variance of the predictions, while explanation is gained by constructing interpretable regression models through variable selection. Lasso regression, a penalized regression method, adds bias to the model's estimates and reduces their variance to enhance prediction. Ridge regression, on the other hand, introduces a small amount of bias into the estimates to obtain predictions that hold up better over the long term. Both methods have been proposed as alternatives to the least squares (LS) approach in the presence of multicollinearity. Because they handle multicollinearity, they have the properties needed to reduce the numerical instability caused by overfitting, and prediction accuracy can improve as a result. This study uses a dataset on coronavirus disease (Covid-19), which has had a significant impact on life worldwide, particularly in our region (Kurdistan), where life has changed dramatically and many people have died of the disease. The data are used to analyze the benefits of each of the two regression methods. The results show that, in the presence of multicollinearity, the Lasso approach produces more accurate and reliable predictions than the Ridge and LS methods; the comparison was carried out in NCSS10, EViews 12, and SPSS 25.
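
The contrast between the L1 and L2 penalties under multicollinearity can be illustrated with a minimal sketch. It is not the paper's analysis, which was carried out in NCSS10, EViews 12, and SPSS 25 on Covid-19 data. The toy data, the penalty strength of 0.5, and the helper names `soft_threshold` and `fit` are all assumptions made for illustration. The sketch uses plain coordinate descent, a standard technique for penalized least squares, in pure Python.

```python
# Hypothetical toy data (NOT the paper's Covid-19 dataset): x2 is nearly a copy of x1.
x1 = [-2.5, -1.5, -0.5, 0.5, 1.5, 2.5]
e = [0.1, -0.1, 0.1, -0.1, 0.1, -0.1]
x2 = [a + b for a, b in zip(x1, e)]
# The response depends on x1 only, plus a small disturbance aligned with x2 - x1.
y = [2.0 * a - 3.0 * b for a, b in zip(x1, e)]
X = [x1, x2]  # predictors stored column-wise


def soft_threshold(z, g):
    """Shrink z toward zero by g; return exactly 0.0 when |z| <= g."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0


def fit(columns, y, l1=0.0, l2=0.0, max_sweeps=20000, tol=1e-13):
    """Coordinate descent for the objective
        (1/(2n)) * ||y - Xb||^2 + l1 * ||b||_1 + (l2/2) * ||b||_2^2.
    l1 = l2 = 0 gives least squares, l1 > 0 gives Lasso, l2 > 0 gives Ridge.
    No intercept is fitted; the toy predictors and response are already centered.
    """
    n = len(y)
    p = len(columns)
    b = [0.0] * p
    for _ in range(max_sweeps):
        biggest_change = 0.0
        for j, xj in enumerate(columns):
            # Partial residual: remove the fit of every predictor except j.
            r = [y[i] - sum(b[k] * columns[k][i] for k in range(p) if k != j)
                 for i in range(n)]
            rho = sum(xj[i] * r[i] for i in range(n)) / n
            z = sum(v * v for v in xj) / n
            # The L1 penalty soft-thresholds; the L2 penalty enlarges the denominator.
            new_bj = soft_threshold(rho, l1) / (z + l2)
            biggest_change = max(biggest_change, abs(new_bj - b[j]))
            b[j] = new_bj
        if biggest_change < tol:
            break
    return b


ls = fit(X, y)               # unpenalized least squares
ridge = fit(X, y, l2=0.5)    # L2 penalty (assumed strength 0.5)
lasso = fit(X, y, l1=0.5)    # L1 penalty (assumed strength 0.5)
print("LS   :", [round(v, 4) for v in ls])
print("Ridge:", [round(v, 4) for v in ridge])
print("Lasso:", [round(v, 4) for v in lasso])
```

On this constructed example, least squares splits the effect into two large coefficients of opposite sign (about 5 and -3), Ridge shares it between the two correlated predictors (about 0.99 and 0.90), and Lasso keeps one predictor (about 1.88) while setting the other exactly to zero. That exact zero is what allows Lasso to act as a variable selection method. These numbers describe the toy data only and say nothing about the results reported in the paper.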