Soft Clipping Mish - A Novel Activation Function for Deep Learning
Marina Adriana Mercioni, S. Holban
2021 4th International Conference on Information and Computer Technologies (ICICT)
Published: 2021-03-01
DOI: 10.1109/ICICT52872.2021.00010 (https://doi.org/10.1109/ICICT52872.2021.00010)
Citation count: 0
Abstract
This study introduces a novel activation function called Soft Clipping Mish, designed to improve the performance of the architecture in which it is used. Its capability was tested in different scenarios, using several datasets and the LeNet-5 architecture; these varied testing conditions strengthen our proposal, as the experimental phase also shows. We used the following datasets: MNIST, Fashion-MNIST, CIFAR-10, and CIFAR-100 for a classification task, and the Beijing PM2.5 dataset for a prediction task that determines the rank of air pollution. We introduce two variants of the function: Soft Clipping Mish with a predefined parameter, and Soft Clipping Mish learnable, whose learnable parameter gives more flexibility in the weight updates. This learnable parameter was initialized to 0.25 in the training phase. Our proposal was inspired by a recent activation function called Mish.
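For orientation, the Mish function that inspired this proposal is commonly defined as mish(x) = x * tanh(softplus(x)), with softplus(x) = ln(1 + e^x). The sketch below implements only that baseline in plain Python. It is not Soft Clipping Mish, whose formula is not given in this abstract, and the helper names `softplus` and `mish` are chosen here for illustration.

```python
import math


def softplus(x: float) -> float:
    """Softplus, ln(1 + e^x), written in a numerically stable form."""
    # ln(1 + e^x) = max(x, 0) + ln(1 + e^(-|x|)); exp never sees a large
    # positive argument, so it cannot overflow.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def mish(x: float) -> float:
    """Baseline Mish activation: x * tanh(softplus(x))."""
    return x * math.tanh(softplus(x))


if __name__ == "__main__":
    # Print a few sample points to show the shape of the curve.
    for value in (-5.0, -1.0, 0.0, 1.0, 5.0):
        print(f"mish({value:+.1f}) = {mish(value):.4f}")
```

Mish is smooth, slightly negative for negative inputs, and close to the identity for large positive inputs. A learnable variant such as the one described above would add a trainable scalar (initialized to 0.25 according to the abstract), but where that parameter enters the formula is not stated here.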