{"title":"作为学习率调度器的循环对数退火法","authors":"Philip Naveen","doi":"arxiv-2403.14685","DOIUrl":null,"url":null,"abstract":"A learning rate scheduler is a predefined set of instructions for varying\nsearch stepsizes during model training processes. This paper introduces a new\nlogarithmic method using harsh restarting of step sizes through stochastic\ngradient descent. Cyclical log annealing implements the restart pattern more\naggressively to maybe allow the usage of more greedy algorithms on the online\nconvex optimization framework. The algorithm was tested on the CIFAR-10 image\ndatasets, and seemed to perform analogously with cosine annealing on large\ntransformer-enhanced residual neural networks. Future experiments would involve\ntesting the scheduler in generative adversarial networks and finding the best\nparameters for the scheduler with more experiments.","PeriodicalId":501256,"journal":{"name":"arXiv - CS - Mathematical Software","volume":"34 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-03-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Cyclical Log Annealing as a Learning Rate Scheduler\",\"authors\":\"Philip Naveen\",\"doi\":\"arxiv-2403.14685\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"A learning rate scheduler is a predefined set of instructions for varying\\nsearch stepsizes during model training processes. This paper introduces a new\\nlogarithmic method using harsh restarting of step sizes through stochastic\\ngradient descent. Cyclical log annealing implements the restart pattern more\\naggressively to maybe allow the usage of more greedy algorithms on the online\\nconvex optimization framework. The algorithm was tested on the CIFAR-10 image\\ndatasets, and seemed to perform analogously with cosine annealing on large\\ntransformer-enhanced residual neural networks. Future experiments would involve\\ntesting the scheduler in generative adversarial networks and finding the best\\nparameters for the scheduler with more experiments.\",\"PeriodicalId\":501256,\"journal\":{\"name\":\"arXiv - CS - Mathematical Software\",\"volume\":\"34 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-03-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Mathematical Software\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2403.14685\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Mathematical Software","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2403.14685","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Cyclical Log Annealing as a Learning Rate Scheduler
A learning rate scheduler is a predefined set of instructions for varying
search step sizes during model training. This paper introduces a new
logarithmic annealing method that applies hard restarts to the step size
under stochastic gradient descent. Cyclical log annealing applies the
restart pattern more aggressively, which may permit greedier algorithms
within the online convex optimization framework. The scheduler was tested
on the CIFAR-10 image dataset, where it appeared to perform comparably to
cosine annealing on large transformer-enhanced residual neural networks.
Future work includes testing the scheduler in generative adversarial
networks and tuning its parameters through further experiments.
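
The abstract describes the schedule only qualitatively, so the Python sketch below illustrates one plausible reading: a logarithmic decay of the learning rate within each cycle, with a hard (warm) restart back to the base rate at every cycle boundary, in the spirit of SGDR-style cosine annealing with warm restarts. The exact decay formula, the parameter names (base_lr, min_lr, cycle_len), and the default values are illustrative assumptions, not the paper's definition.

```python
import math

def cyclical_log_annealing(step, base_lr=0.1, min_lr=1e-4, cycle_len=1000):
    """Sketch of a cyclical log-annealing schedule with hard restarts.

    ASSUMPTION: the paper's exact formula is not given in the abstract;
    this variant decays logarithmically from base_lr toward min_lr within
    each cycle and restarts harshly at every cycle boundary.
    """
    t = step % cycle_len  # position within the current cycle; the modulo
                          # implements the hard restart at cycle boundaries
    # Logarithmic decay factor in [0, 1): drops quickly early in the cycle,
    # then flattens as t approaches cycle_len.
    decay = math.log(1 + t) / math.log(1 + cycle_len)
    return min_lr + (base_lr - min_lr) * (1 - decay)

if __name__ == "__main__":
    # The jump back to base_lr at step 1000 shows the hard restart.
    for s in (0, 1, 10, 100, 999, 1000):
        print(s, round(cyclical_log_annealing(s), 5))
```

If used with a framework such as PyTorch, a function like this (called with base_lr=1.0 so it returns a multiplicative factor) could be passed as the lr_lambda argument to torch.optim.lr_scheduler.LambdaLR.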