Syed Mahbubuz Zaman, M. Hasan, Redwan Islam Sakline, Dipto Das, Md. Ashraful Alam
{"title":"A Comparative Analysis of Optimizers in Recurrent Neural Networks for Text Classification","authors":"Syed Mahbubuz Zaman, M. Hasan, Redwan Islam Sakline, Dipto Das, Md. Ashraful Alam","doi":"10.1109/CSDE53843.2021.9718394","DOIUrl":null,"url":null,"abstract":"The performance of any deep learning model depends heavily on the choice of optimizers and their corresponding hyper-parameters. For any given problem researchers struggle to select the best possible optimizer from a myriad of optimizers proposed in existing literature. Currently the process of optimizer selection in practice is anecdotal at best whereby practitioners either randomly select an optimizer or rely on best practices or online recommendations not grounded on empirical evidence base. In our paper, we delve deep into this problem of picking the right optimizer for text based datasets and linguistic classification problems, by bench-marking ten optimizers on three different RNN models (Bi-GRU, Bi-LSTM and BRNN) on three spam email based benchmark datasets. We analyse the performance of models employing these optimizers using train accuracy, train loss, validation accuracy, validation loss, test accuracy, test loss and RO-AUC score as metrics. 
The results show that Adaptive Optimization methods (RMSprop, Adam, Adam weight decay and Nadam) with default hyper-parameters outperform other optimizers in all three datasets and RNN model variations.","PeriodicalId":166950,"journal":{"name":"2021 IEEE Asia-Pacific Conference on Computer Science and Data Engineering (CSDE)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE Asia-Pacific Conference on Computer Science and Data Engineering (CSDE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CSDE53843.2021.9718394","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citation count: 3
Abstract
The performance of any deep learning model depends heavily on the choice of optimizer and its corresponding hyper-parameters. For any given problem, researchers struggle to select the best possible optimizer from the myriad of optimizers proposed in the existing literature. In practice, optimizer selection is anecdotal at best: practitioners either pick an optimizer at random or rely on best practices and online recommendations that are not grounded in empirical evidence. In this paper, we examine the problem of choosing the right optimizer for text-based datasets and linguistic classification problems by benchmarking ten optimizers on three RNN architectures (Bi-GRU, Bi-LSTM and BRNN) across three spam-email benchmark datasets. We analyse the performance of models trained with these optimizers using train accuracy, train loss, validation accuracy, validation loss, test accuracy, test loss and ROC-AUC score as metrics. The results show that adaptive optimization methods (RMSprop, Adam, AdamW and Nadam) with default hyper-parameters outperform the other optimizers across all three datasets and RNN model variants.
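To make the distinction between plain and adaptive optimizers concrete, the sketch below (not the paper's code; a minimal illustration on a toy 1-D quadratic loss f(w) = w², whose gradient is 2w) implements the standard update rules of SGD and of two of the adaptive methods the abstract highlights, RMSprop and Adam. The learning rates shown are the common defaults; all function names are illustrative.

```python
import math

def sgd_step(w, g, state, t, lr=0.01):
    # Vanilla SGD: a fixed step in the negative-gradient direction.
    return w - lr * g, state

def rmsprop_step(w, g, state, t, lr=0.001, rho=0.9, eps=1e-8):
    # RMSprop: divide the step by a running average of squared gradients.
    v = rho * state + (1 - rho) * g * g
    return w - lr * g / (math.sqrt(v) + eps), v

def adam_step(w, g, state, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    # Adam: bias-corrected first- and second-moment estimates of the gradient.
    m, v = state
    m = b1 * m + (1 - b1) * g          # first moment (mean)
    v = b2 * v + (1 - b2) * g * g      # second moment (uncentred variance)
    m_hat = m / (1 - b1 ** t)          # bias correction for early steps
    v_hat = v / (1 - b2 ** t)
    return w - lr * m_hat / (math.sqrt(v_hat) + eps), (m, v)

def minimize(step, init_state, w0=5.0, steps=500):
    """Run `steps` updates on f(w) = w^2, whose gradient is 2w."""
    w, state = w0, init_state
    for t in range(1, steps + 1):
        w, state = step(w, 2.0 * w, state, t)
    return w

final = {
    "SGD": minimize(sgd_step, None),
    "RMSprop": minimize(rmsprop_step, 0.0),
    "Adam": minimize(adam_step, (0.0, 0.0)),
}
print(final)  # each |w| should have shrunk from the start value of 5.0
```

Note that AdamW, also covered in the paper, further decouples weight-decay regularization from the Adam gradient update; it is omitted here because the toy loss has no regularization term.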