Ensemble Learning based on CNN and Transformer Models for Leaf Diseases Classification
Li-Hua Li, Radius Tanone
2024 18th International Conference on Ubiquitous Information Management and Communication (IMCOM), pp. 1-6
Published: 2024-01-03
DOI: 10.1109/IMCOM60618.2024.10418393 (https://doi.org/10.1109/IMCOM60618.2024.10418393)
Citation count: 0
Abstract
Symptoms on the leaves are often the first indication of a plant disease. To avoid disrupting crop production, farmers need to identify diseases from leaf symptoms as quickly as possible. This problem has long been addressed with a variety of computational techniques, including deep learning models. Today, many specialized deep learning models are built on Transformer or Convolutional Neural Network (CNN) architectures. However, the accuracy and performance of an individual deep learning model depend on many factors, such as the number of parameters, the training time, and the dataset used, and a single model is often not well suited to problems such as image classification of leaf diseases. This study proposes an ensemble learning approach based on CNN and Transformer models. The models used are MobileNetV3, DenseNet201, ResNeXt50, Vision Transformer, and Swin Transformer. The purpose of combining these five models is to achieve good accuracy and performance through weighted voting, such as hard voting and soft voting. The experimental findings indicate that ensemble learning with the five models yields improved accuracy and performance on three datasets: corn leaf diseases, grape leaf diseases, and potato leaf diseases. Our experiments also showed that the Vision Transformer model has higher accuracy than the other models. For a more detailed analysis, we use the Grad-CAM technique to visualize how each model uses gradients to arrive at a classification score. The results of this experiment can serve as a recommendation for the agricultural sector, so that the approach can be implemented as early as possible to address the problem of leaf disease classification.
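The abstract names hard voting and soft voting but gives no weights or implementation details, so the following is only a minimal sketch of how the two schemes are commonly computed from per-model class probabilities. The function names, the weights, and the demo probabilities are all hypothetical, and three stand-in models are used instead of the paper's five to keep the example short.

```python
import numpy as np


def soft_vote(probs, weights=None):
    """Weighted soft voting: average class probabilities, then take the argmax.

    probs has shape (n_models, n_samples, n_classes).
    """
    probs = np.asarray(probs, dtype=float)
    n_models = probs.shape[0]
    w = np.ones(n_models) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalise so the weights sum to 1
    avg = np.tensordot(w, probs, axes=1)  # -> (n_samples, n_classes)
    return avg.argmax(axis=1)


def hard_vote(probs, weights=None):
    """Weighted hard voting: each model casts its weight for its top class."""
    probs = np.asarray(probs, dtype=float)
    n_models, n_samples, n_classes = probs.shape
    w = np.ones(n_models) if weights is None else np.asarray(weights, dtype=float)
    labels = probs.argmax(axis=2)  # (n_models, n_samples) predicted labels
    tally = np.zeros((n_samples, n_classes))
    for m in range(n_models):
        tally[np.arange(n_samples), labels[m]] += w[m]
    return tally.argmax(axis=1)  # ties resolve to the lowest class index


# Hypothetical probabilities: 3 models, 2 leaf images, 3 disease classes.
demo_probs = np.array([
    [[0.9, 0.1, 0.0], [0.1, 0.2, 0.7]],  # model A (illustrative)
    [[0.4, 0.6, 0.0], [0.2, 0.2, 0.6]],  # model B (illustrative)
    [[0.4, 0.6, 0.0], [0.0, 0.1, 0.9]],  # model C (illustrative)
])

print("soft:", soft_vote(demo_probs))
print("hard:", hard_vote(demo_probs))
print("hard, weighted:", hard_vote(demo_probs, weights=[3, 1, 1]))
```

On the first demo image the two schemes disagree: two models narrowly prefer class 1, so the unweighted hard vote picks it, while soft voting follows the single highly confident model and picks class 0. This is one reason an ensemble study might report both schemes separately.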