Numerical properties of solutions of LASSO regression
Authors: Mayur V. Lakshmi, Joab R. Winkler
Journal: Applied Numerical Mathematics, volume 208, pages 297-309 (published 2025-02-01)
DOI: 10.1016/j.apnum.2024.03.010
URL: https://www.sciencedirect.com/science/article/pii/S0168927424000576
The determination of a concise model of a linear system when there are fewer samples $m$ than predictors $n$ requires the solution of the equation $Ax = b$, where $A \in \mathbb{R}^{m \times n}$ and $\operatorname{rank} A = m$, such that the selected solution from the infinite number of solutions is sparse, that is, many of its components are zero. This leads to the minimisation with respect to $x$ of $f(x, \lambda) = \|Ax - b\|_2^2 + \lambda \|x\|_1$, where $\lambda$ is the regularisation parameter. This problem, which is called LASSO regression, yields a family of functions $x_{\text{lasso}}(\lambda)$ and it is necessary to determine the optimal value of $\lambda$, that is, the value of $\lambda$ that balances the fidelity of the model, $\|A x_{\text{lasso}}(\lambda) - b\| \approx 0$, and the satisfaction of the constraint that $x_{\text{lasso}}(\lambda)$ be sparse. The aim of this paper is an investigation of the numerical properties of $x_{\text{lasso}}(\lambda)$, and the main conclusion of this investigation is the incompatibility of sparsity and stability, that is, a sparse solution $x_{\text{lasso}}(\lambda)$ that preserves the fidelity of the model exists if the least squares (LS) solution $x_{\text{ls}} = A^{\dagger} b$ is unstable. Two methods, cross validation and the L-curve, for the computation of the optimal value of $\lambda$ are compared and it is shown that the L-curve yields significantly better results. This difference between stable and unstable solutions $x_{\text{ls}}$ of the LS problem manifests itself in the very different forms of the L-curve for these two solutions. The paper includes examples of stable and unstable solutions $x_{\text{ls}}$ that demonstrate the theory.
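The LASSO objective described in the abstract, the squared 2-norm of the residual plus λ times the 1-norm of x, can be explored numerically with a short sketch. The code below is an illustration only and is not taken from the paper: it assumes a simple proximal gradient iteration (ISTA) as the solver, a synthetic random matrix with fewer samples than predictors, and a hypothetical grid of λ values. It records the residual norm and the 1-norm of the solution for each λ, which are the two quantities usually plotted against each other to form an L-curve, and it computes the minimum-norm LS solution with the pseudoinverse for comparison.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (componentwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def objective(A, b, x, lam):
    """f(x, lam) = ||Ax - b||_2^2 + lam * ||x||_1."""
    return np.sum((A @ x - b) ** 2) + lam * np.sum(np.abs(x))

def lasso_ista(A, b, lam, n_iter=2000):
    """Approximately minimise f(x, lam) by ISTA (an assumed solver choice)."""
    # The gradient of ||Ax - b||_2^2 is 2 A^T (Ax - b); its Lipschitz
    # constant is 2 * ||A||_2^2, so 1 / (2 ||A||_2^2) is a safe step size.
    step = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * A.T @ (A @ x - b)
        x = soft_threshold(x - step * grad, step * lam)
    return x

# Synthetic underdetermined problem (hypothetical data): m < n, rank A = m.
rng = np.random.default_rng(0)
m, n = 20, 50
A = rng.standard_normal((m, n))
x_true = np.zeros(n)
x_true[[3, 17, 41]] = [2.0, -1.5, 1.0]
b = A @ x_true

# Minimum-norm least squares solution x_ls = pinv(A) b (generally dense).
x_ls = np.linalg.pinv(A) @ b

# Sample the family x_lasso(lam) on an assumed grid of lam values and store
# (lam, residual norm, 1-norm of solution, number of nonzero components).
lambdas = np.logspace(-3, 1, 9)
curve = []
for lam in lambdas:
    x = lasso_ista(A, b, lam)
    curve.append((lam,
                  np.linalg.norm(A @ x - b),
                  np.sum(np.abs(x)),
                  int(np.count_nonzero(x))))

for lam, res, l1, nnz in curve:
    print(f"lam={lam:9.3e}  residual={res:9.3e}  l1={l1:8.3f}  nonzeros={nnz}")
```

Plotting the stored residual norms against the 1-norms on logarithmic axes gives an L-curve for this synthetic example. How the optimal λ is then selected from the curve, and how the curve differs between stable and unstable LS solutions, is the subject of the paper and is not reproduced here.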
About the journal:
The purpose of the journal is to provide a forum for the publication of high quality research and tutorial papers in computational mathematics. In addition to the traditional issues and problems in numerical analysis, the journal also publishes papers describing relevant applications in such fields as physics, fluid dynamics, engineering and other branches of applied science with a computational mathematics component. The journal strives to be flexible in the type of papers it publishes and their format. Equally desirable are:
(i) Full papers, which should be complete and relatively self-contained original contributions with an introduction that can be understood by the broad computational mathematics community. Both rigorous and heuristic styles are acceptable. Of particular interest are papers about new areas of research, in which other than strictly mathematical arguments may be important in establishing a basis for further developments.
(ii) Tutorial review papers, covering some of the important issues in Numerical Mathematics, Scientific Computing and their Applications. The journal will occasionally publish contributions which are larger than the usual format for regular papers.
(iii) Short notes, which present specific new results and techniques in a brief communication.