{"title":"加速梯度方法的状态空间视角:那达慕、RAdam和重尺度梯度流","authors":"Kushal Chakrabarti, N. Chopra","doi":"10.1109/ICC56513.2022.10093397","DOIUrl":null,"url":null,"abstract":"Fast gradient-descent algorithms are the default practice in training complex machine learning models. This paper presents the convergence guarantee of two existing adaptive gradient algorithms, Nadam and RAdam, for the first time, and the rescaled gradient flow in solving non-convex optimization. The analyses of all three algorithms are unified by a common underlying proof sketch, relying upon Barbalat's lemma. The utility of another tool from classical control, the transfer function, hitherto used to propose a new variant of the famous Adam optimizer, is extended in this paper for developing an improved variant of the Nadam algorithm. Our experimental results validate the efficiency of this proposed algorithm for solving benchmark machine learning problems in a shorter time and with enhanced accuracy.","PeriodicalId":101654,"journal":{"name":"2022 Eighth Indian Control Conference (ICC)","volume":"321 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A State-Space Perspective on the Expedited Gradient Methods: Nadam, RAdam, and Rescaled Gradient Flow\",\"authors\":\"Kushal Chakrabarti, N. Chopra\",\"doi\":\"10.1109/ICC56513.2022.10093397\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Fast gradient-descent algorithms are the default practice in training complex machine learning models. This paper presents the convergence guarantee of two existing adaptive gradient algorithms, Nadam and RAdam, for the first time, and the rescaled gradient flow in solving non-convex optimization. The analyses of all three algorithms are unified by a common underlying proof sketch, relying upon Barbalat's lemma. The utility of another tool from classical control, the transfer function, hitherto used to propose a new variant of the famous Adam optimizer, is extended in this paper for developing an improved variant of the Nadam algorithm. Our experimental results validate the efficiency of this proposed algorithm for solving benchmark machine learning problems in a shorter time and with enhanced accuracy.\",\"PeriodicalId\":101654,\"journal\":{\"name\":\"2022 Eighth Indian Control Conference (ICC)\",\"volume\":\"321 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-12-14\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 Eighth Indian Control Conference (ICC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICC56513.2022.10093397\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 Eighth Indian Control Conference (ICC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICC56513.2022.10093397","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A State-Space Perspective on the Expedited Gradient Methods: Nadam, RAdam, and Rescaled Gradient Flow
Fast gradient-descent algorithms are the default practice in training complex machine learning models. This paper presents, for the first time, convergence guarantees for two existing adaptive gradient algorithms, Nadam and RAdam, and for the rescaled gradient flow, in solving non-convex optimization problems. The analyses of all three algorithms are unified by a common underlying proof sketch relying upon Barbalat's lemma. Another tool from classical control, the transfer function, hitherto used to propose a new variant of the well-known Adam optimizer, is extended in this paper to develop an improved variant of the Nadam algorithm. Our experimental results validate that the proposed algorithm solves benchmark machine learning problems in a shorter time and with improved accuracy.
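For reference, Barbalat's lemma, the tool underlying the unified convergence analysis mentioned in the abstract, is stated below in its standard form; the remark on gradient norms describes a typical way such lemmas are applied and is not a claim about the paper's exact argument.

```latex
% Barbalat's lemma (standard statement).
\textbf{Lemma (Barbalat).} If $f:[0,\infty)\to\mathbb{R}$ is uniformly continuous and
$\lim_{t\to\infty}\int_{0}^{t} f(\tau)\,\mathrm{d}\tau$ exists and is finite, then
$\lim_{t\to\infty} f(t)=0$.
% In continuous-time analyses of gradient methods, it is typically applied with
% $f(t)=\|\nabla F(x(t))\|^{2}$ to conclude that the gradient vanishes along trajectories.
```

For readers unfamiliar with Nadam, the following is a minimal NumPy sketch of the standard Nesterov-accelerated Adam (Nadam) update with constant momentum parameter (i.e., without Dozat's momentum-decay schedule). It is not the improved variant proposed in the paper; the function name `nadam_step` and the default hyperparameters are illustrative assumptions.

```python
import numpy as np

def nadam_step(theta, grad, m, v, t, lr=2e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One standard Nadam step; t is the 1-based iteration counter."""
    m = beta1 * m + (1 - beta1) * grad        # first-moment (momentum) estimate
    v = beta2 * v + (1 - beta2) * grad**2     # second-moment estimate
    # Nesterov-style bias correction: combine the look-ahead momentum term
    # with the current gradient, each with its own bias-correction factor.
    m_bar = beta1 * m / (1 - beta1**(t + 1)) + (1 - beta1) * grad / (1 - beta1**t)
    v_hat = v / (1 - beta2**t)
    theta = theta - lr * m_bar / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage: minimize f(x) = ||x||^2 from a fixed starting point.
theta = np.array([1.0, -2.0])
m, v = np.zeros_like(theta), np.zeros_like(theta)
for t in range(1, 501):
    theta, m, v = nadam_step(theta, 2 * theta, m, v, t)
```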