{"title":"Generalized Hessian approximations via Stein's lemma for constrained minimization","authors":"Murat A. Erdogdu","doi":"10.1109/ITA.2017.8023450","DOIUrl":null,"url":null,"abstract":"We consider the problem of convex constrained minimization of an average of n functions, where the parameter and the features are related through inner products. We focus on second order batch updates, where the curvature matrix is obtained by assuming random design and by applying the celebrated Stein's lemma together with subsampling techniques. The proposed algorithm enjoys fast convergence rates similar to the Newton method, yet the per-iteration cost has the same order of magnitude as the gradient descent. We demonstrate its performance on well-known optimization problems where Stein's lemma is not directly applicable, such as M-estimation for robust statistics, and inequality form linear/quadratic programming etc. Under certain assumptions, we show that the constrained optimization algorithm attains a composite convergence rate that is initially quadratic and asymptotically linear. We validate its performance through widely encountered optimization tasks on several real and synthetic datasets by comparing it to classical optimization algorithms.","PeriodicalId":305510,"journal":{"name":"2017 Information Theory and Applications Workshop (ITA)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 Information Theory and Applications Workshop (ITA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ITA.2017.8023450","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
We consider the problem of convex constrained minimization of an average of n functions, where the parameter and the features are related through inner products. We focus on second-order batch updates, where the curvature matrix is obtained by assuming a random design and applying the celebrated Stein's lemma together with subsampling techniques. The proposed algorithm enjoys fast convergence rates similar to those of Newton's method, yet its per-iteration cost is of the same order of magnitude as that of gradient descent. We demonstrate its performance on well-known optimization problems where Stein's lemma is not directly applicable, such as M-estimation in robust statistics and inequality-form linear and quadratic programming. Under certain assumptions, we show that the constrained optimization algorithm attains a composite convergence rate that is initially quadratic and asymptotically linear. We validate its performance on widely encountered optimization tasks over several real and synthetic datasets by comparing it to classical optimization algorithms.
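To make the kind of update described above concrete, the sketch below illustrates a subsampled, Stein-lemma-based curvature estimate for a loss of the form f(θ) = (1/n) Σ ψ(⟨x_i, θ⟩), followed by a projected second-order step. This is a minimal illustration under assumed conditions (approximately Gaussian features, a smooth link ψ, a user-supplied projection proj_C, and a particular rank-one correction term), not the paper's exact algorithm; the function names, the ridge term, and the step size are hypothetical choices for the example.

```python
import numpy as np

def stein_hessian(X_sub, theta, psi_d2, psi_d4, Sigma_hat):
    # Subsampled Stein-lemma curvature estimate (sketch; assumes ~Gaussian features):
    #   E[psi''(<x,theta>) x x^T] ≈ mean(psi'') * Sigma
    #                               + mean(psi'''') * (Sigma theta)(Sigma theta)^T
    z = X_sub @ theta
    mu2 = psi_d2(z).mean()          # subsampled estimate of E[psi'']
    mu4 = psi_d4(z).mean()          # subsampled estimate of E[psi'''']
    s = Sigma_hat @ theta
    return mu2 * Sigma_hat + mu4 * np.outer(s, s)

def constrained_stein_newton(X, grad_f, psi_d2, psi_d4, proj_C,
                             n_iter=50, subsample=500, step=1.0, ridge=1e-6):
    n, p = X.shape
    Sigma_hat = X.T @ X / n          # covariance estimate, computed once
    theta = np.zeros(p)
    for _ in range(n_iter):
        idx = np.random.choice(n, size=min(subsample, n), replace=False)
        H = stein_hessian(X[idx], theta, psi_d2, psi_d4, Sigma_hat)
        g = grad_f(theta)            # full (batch) gradient
        direction = np.linalg.solve(H + ridge * np.eye(p), g)
        theta = proj_C(theta - step * direction)  # projected second-order step
    return theta
```

For instance, with a logistic loss one would take psi_d2 and psi_d4 to be the second and fourth derivatives of log(1 + exp(·)), and proj_C to be the Euclidean projection onto the constraint set; the per-iteration cost is dominated by forming the gradient and solving a single p×p linear system, rather than assembling an exact Hessian from all n samples.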