Ricardo Xavier Llugsi Cañar, S. El Yacoubi, Allyx Fontaine, P. Lupera
{"title":"一种与解耦权衰减(AdamL)相关的新型Adam方法","authors":"Ricardo Xavier Llugsi Cañar, S. El Yacoubi, Allyx Fontaine, P. Lupera","doi":"10.1109/LA-CCI48322.2021.9769816","DOIUrl":null,"url":null,"abstract":"The use of optimizers makes it possible to reduce losses during the learning process of a neural network. Currently there are some types of optimizers whose effectiveness has already been proven, an example of this is Adam. Adam is an extension to Stochastic Gradient Decent that makes use of Momentum and Adaptive Learning to converge faster. An interesting alternative to complement the Adam’s work is the addition of weight decay. This is done to decouple the weight decay from the gradient-based update. Some attempts have been developed previously, however its correct operation has not been keenly proven. In this work, a weight decay decoupling alternative is presented and acutely analyzed. The algorithm’s convergence is mathematically verified and its operation too through the use of a Convolutional Encoder-Decoder network and the application of strategies for error reduction. The AdamL operation is verified by the achievement of a proper Temperature Forecast with a percentage error lower than 4.5%. It can be seen too that the forecast error deepens around noon but it does not exceed 1.47°C.","PeriodicalId":431041,"journal":{"name":"2021 IEEE Latin American Conference on Computational Intelligence (LA-CCI)","volume":"56 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A novel Adam approach related to Decoupled Weight Decay (AdamL)\",\"authors\":\"Ricardo Xavier Llugsi Cañar, S. El Yacoubi, Allyx Fontaine, P. Lupera\",\"doi\":\"10.1109/LA-CCI48322.2021.9769816\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The use of optimizers makes it possible to reduce losses during the learning process of a neural network. Currently there are some types of optimizers whose effectiveness has already been proven, an example of this is Adam. Adam is an extension to Stochastic Gradient Decent that makes use of Momentum and Adaptive Learning to converge faster. An interesting alternative to complement the Adam’s work is the addition of weight decay. This is done to decouple the weight decay from the gradient-based update. Some attempts have been developed previously, however its correct operation has not been keenly proven. In this work, a weight decay decoupling alternative is presented and acutely analyzed. The algorithm’s convergence is mathematically verified and its operation too through the use of a Convolutional Encoder-Decoder network and the application of strategies for error reduction. The AdamL operation is verified by the achievement of a proper Temperature Forecast with a percentage error lower than 4.5%. 
It can be seen too that the forecast error deepens around noon but it does not exceed 1.47°C.\",\"PeriodicalId\":431041,\"journal\":{\"name\":\"2021 IEEE Latin American Conference on Computational Intelligence (LA-CCI)\",\"volume\":\"56 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-11-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 IEEE Latin American Conference on Computational Intelligence (LA-CCI)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/LA-CCI48322.2021.9769816\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE Latin American Conference on Computational Intelligence (LA-CCI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/LA-CCI48322.2021.9769816","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A novel Adam approach related to Decoupled Weight Decay (AdamL)
The use of optimizers makes it possible to reduce losses during the learning process of a neural network. Several optimizers whose effectiveness has already been proven are currently available; Adam is one example. Adam is an extension of Stochastic Gradient Descent that uses momentum and adaptive learning rates to converge faster. An interesting way to complement Adam's update is the addition of weight decay, decoupled from the gradient-based update. Some attempts have been made previously; however, their correct operation has not been rigorously proven. In this work, a weight decay decoupling alternative is presented and analyzed in depth. The algorithm's convergence is verified mathematically, and its operation is verified empirically using a Convolutional Encoder-Decoder network together with error-reduction strategies. The operation of AdamL is confirmed by obtaining an accurate temperature forecast with a percentage error below 4.5%. The forecast error also increases around noon, but it does not exceed 1.47 °C.
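The abstract does not spell out the exact AdamL update rule, but the decoupling idea it describes, namely applying weight decay directly to the weights instead of folding it into the gradient-based Adam step, can be illustrated with a short sketch. The NumPy class below is an illustration in the spirit of decoupled weight decay (as in AdamW), not the authors' implementation; the class name, hyperparameter defaults, and placement of the decay term are assumptions.

```python
# A minimal sketch of Adam with decoupled weight decay, in NumPy.
# This is NOT the paper's AdamL algorithm; it only illustrates the idea of
# keeping the decay term out of the adaptive, gradient-based update.
import numpy as np

class DecoupledWeightDecayAdam:
    def __init__(self, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8, weight_decay=1e-2):
        self.lr, self.beta1, self.beta2 = lr, beta1, beta2
        self.eps, self.weight_decay = eps, weight_decay
        self.m = None  # first-moment (momentum) estimate
        self.v = None  # second-moment (adaptive scaling) estimate
        self.t = 0     # timestep for bias correction

    def step(self, params, grads):
        """Return the updated parameters given the current gradient."""
        if self.m is None:
            self.m = np.zeros_like(params)
            self.v = np.zeros_like(params)
        self.t += 1
        # Standard Adam moment updates: only the gradient enters here,
        # so the weight decay never passes through the adaptive scaling.
        self.m = self.beta1 * self.m + (1 - self.beta1) * grads
        self.v = self.beta2 * self.v + (1 - self.beta2) * grads ** 2
        m_hat = self.m / (1 - self.beta1 ** self.t)
        v_hat = self.v / (1 - self.beta2 ** self.t)
        adam_step = self.lr * m_hat / (np.sqrt(v_hat) + self.eps)
        # Decoupled weight decay: applied directly to the weights,
        # outside the gradient-based update.
        return params - adam_step - self.lr * self.weight_decay * params
```

A typical training loop would compute the gradient of the loss at the current weights and call params = opt.step(params, grads) once per iteration; the only difference from plain Adam is the final decay term, which shrinks the weights independently of the gradient statistics.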