Thinh Pham-Duc, M. Ullah, T. Le-Tien, M. Luong, F. A. Cheikh, Øyvind Nordbø
Title: Improvement on Mechanics Attention Deep Learning model for Classification Ear-tag of Swine
DOI: 10.1109/NICS56915.2022.10013403
Published in: 2022 9th NAFOSTED Conference on Information and Computer Science (NICS)
Publication date: 2022-10-31
Citations: 0
Abstract
Classification is a common task that helps computers approximate human vision in deep neural network problems. In this paper, we investigate an enhanced attention mechanism to improve the model's accuracy and apply the focal loss to handle data imbalance in ear-tag classification. Briefly, combining spatial-channel attention with current state-of-the-art Convolutional Neural Networks (CNNs), such as ResNet, DenseNet, and EfficientNet, enhances the model's efficiency on the provided dataset. Moreover, data augmentations, namely rotation, shear, Gaussian noise, cropping, and a set of further augmentations, are applied during the training phase, in which the focal loss is used instead of the traditional cross-entropy (CE) to mitigate data imbalance. The research data presented in this paper was collected at a Norwegian farm and manually annotated. An ablation study of the augmentation, backbone model, and attention mechanism demonstrates the importance of each module in the classification. A detailed analysis of the models and their hyperparameters provides evidence of a significant improvement in performance.
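The focal loss mentioned in the abstract addresses class imbalance by down-weighting well-classified examples, so that training gradients are dominated by hard (often minority-class) samples. A minimal binary sketch in plain Python illustrates the idea; the γ = 2 and α = 0.25 values below are the common defaults from the focal loss literature, not values reported in this paper:

```python
import math

def cross_entropy(p, y):
    """Standard binary cross-entropy for a single prediction.

    p: predicted probability of the positive class, y: true label (0 or 1).
    """
    p_t = p if y == 1 else 1.0 - p
    return -math.log(p_t)

def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss: CE scaled by alpha_t * (1 - p_t)^gamma.

    The (1 - p_t)^gamma factor shrinks toward 0 for confident, correct
    predictions, so easy examples contribute little to the total loss.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

# An easy example (p = 0.9, y = 1) is down-weighted far more strongly,
# relative to cross-entropy, than a hard example (p = 0.1, y = 1).
easy_ratio = focal_loss(0.9, 1) / cross_entropy(0.9, 1)
hard_ratio = focal_loss(0.1, 1) / cross_entropy(0.1, 1)
```

Because the modulating factor (1 − p_t)^γ is near zero for confident correct predictions, the effective weight of easy majority-class examples collapses, which is why the paper substitutes focal loss for plain CE on the imbalanced ear-tag dataset.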