Learning rate range test for the vision transformer
Rinka Kiriyama, A. Sashima, I. Shimizu
International Conference on Images, Signals, and Computing, 2023-08-21
DOI: https://doi.org/10.1117/12.2692013
The solutions obtained by training a deep neural network depend strongly on its hyperparameters, including the learning rate (LR), so finding appropriate training settings is important. This is particularly true for state-of-the-art Vision Transformer (ViT) models, whose structure differs from that of conventional models. In this paper, we focus on the learning rate and search for a better value using the Learning Rate Range Test (LRRT). Our experiments show that the appropriate LR lies at the point where the loss stops decreasing in the LRRT. In addition, we discuss the effects of the number of epochs and of LR warm-up.
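As a rough illustration of the LRRT idea described above, the following minimal sketch raises the learning rate exponentially over a short run, records the loss at each step, and then reports the LR at which the loss stops decreasing. It is not the authors' implementation: a one-dimensional quadratic loss stands in for a ViT, and the names `lr_range_test`, `lr_where_loss_stops_decreasing`, and `quadratic_loss_and_grad`, as well as the LR bounds and step count, are assumptions made for this example. Taking the minimum of the recorded curve is only one possible reading of "where the decrease stops".

```python
import math


def quadratic_loss_and_grad(w, curvature=10.0):
    """Toy stand-in for a network: loss = 0.5 * curvature * ||w||^2."""
    loss = 0.5 * curvature * sum(wi * wi for wi in w)
    grad = [curvature * wi for wi in w]
    return loss, grad


def lr_range_test(loss_and_grad, w0, lr_min=1e-4, lr_max=10.0, num_steps=100):
    """Sweep the LR exponentially from lr_min to lr_max, one update per step.

    Returns a list of (lr, loss) pairs, where loss is measured just before
    the update that uses that lr.
    """
    w = list(w0)
    history = []
    for step in range(num_steps):
        lr = lr_min * (lr_max / lr_min) ** (step / (num_steps - 1))
        loss, grad = loss_and_grad(w)
        history.append((lr, loss))
        # Stop early once training has clearly diverged (assumed threshold).
        if not math.isfinite(loss) or loss > 4 * history[0][1]:
            break
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return history


def lr_where_loss_stops_decreasing(history):
    """Return the LR paired with the lowest recorded loss in the sweep."""
    return min(history, key=lambda pair: pair[1])[0]


history = lr_range_test(quadratic_loss_and_grad, [1.0])
suggested_lr = lr_where_loss_stops_decreasing(history)
print(f"LR where the loss stops decreasing: {suggested_lr:.4f}")
```

For this toy quadratic, plain gradient descent stops reducing the loss once the LR exceeds 2 / curvature = 0.2, so the sweep picks the first scheduled LR past that threshold. With a real model the loss curve is noisy, so some smoothing before locating the turning point would likely be needed.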