{"title":"基于 CIFAR 数据集的变换器和 GoogleNet 模型在图像分类中的对比分析","authors":"Xinran Xie, Qinwen Yan, Haoye Li, Sujie Yan, Zirong Jiang","doi":"10.54254/2755-2721/79/20241537","DOIUrl":null,"url":null,"abstract":"Image classification plays a pivotal role in numerous applications, with substantial implications for daily life, including diagnosing disease from medical images and management of images in autonomous vehicles. However, such sort of research in this field continuously challenges scientists in terms of choosing datasets, testing accuracy, and improvement of models, etc. In this paper, we focus on the performance of two prominent models GoogleNet and residual attention network. We construct two models on the Python platform according to available online resources. To assess their capabilities, we employ the CIFAR-100 dataset, a widely used benchmark dataset. Despite the simplicity of our implementations, GoogleNet comprises approximately 75 convolutional layers and inception modules, and the Residual Attention Network incorporates multiple attention modules within its architecture. These characteristics demonstrate the models' potential for achieving exceptional classification results. Through comprehensive testing and visualization, we aim to provide insights into the efficacy of these models in the context of image classification. Our study contributes to a broader and profounder understanding of their suitability for real-world applications. According to our diagrams and analysis, we conclude that although attention56 is suitable to be adopted in image classification concerning its structure since the model is unstable and invalid in a wide range of training image data on dataset SIFAR100 it might not be exploited in practice. However, as to the model GoogleNet, with an increasing number of training, it obviously is prone to robustness and solid capability of noise resistance. 
Therefore, GoogleNet is a suitable one to be employed in image classification.","PeriodicalId":502253,"journal":{"name":"Applied and Computational Engineering","volume":"30 8","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Comparative analysis of transformer and GoogleNet models in image classification based on the CIFAR dataset\",\"authors\":\"Xinran Xie, Qinwen Yan, Haoye Li, Sujie Yan, Zirong Jiang\",\"doi\":\"10.54254/2755-2721/79/20241537\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Image classification plays a pivotal role in numerous applications, with substantial implications for daily life, including diagnosing disease from medical images and management of images in autonomous vehicles. However, such sort of research in this field continuously challenges scientists in terms of choosing datasets, testing accuracy, and improvement of models, etc. In this paper, we focus on the performance of two prominent models GoogleNet and residual attention network. We construct two models on the Python platform according to available online resources. To assess their capabilities, we employ the CIFAR-100 dataset, a widely used benchmark dataset. Despite the simplicity of our implementations, GoogleNet comprises approximately 75 convolutional layers and inception modules, and the Residual Attention Network incorporates multiple attention modules within its architecture. These characteristics demonstrate the models' potential for achieving exceptional classification results. Through comprehensive testing and visualization, we aim to provide insights into the efficacy of these models in the context of image classification. Our study contributes to a broader and profounder understanding of their suitability for real-world applications. 
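The abstract's note that GoogleNet organizes its convolutional layers into inception modules refers to the key idea of running several parallel branches over the same input and concatenating their outputs along the channel axis. Below is a minimal NumPy sketch of that channel-concatenation idea, reduced to 1x1 (pointwise) branches only; a real inception module also includes 3x3 and 5x5 convolution branches and a pooling branch, and the branch widths here are illustrative, not taken from the paper.

```python
import numpy as np

def conv1x1(x, w):
    """Pointwise (1x1) convolution as a matrix multiply.
    x: input of shape (C_in, H, W); w: weights of shape (C_out, C_in)."""
    c_in, h, wd = x.shape
    return (w @ x.reshape(c_in, -1)).reshape(w.shape[0], h, wd)

def inception_block(x, branch_weights):
    """Run parallel branches over the same input and concatenate
    their outputs along the channel axis, as in GoogleNet."""
    branches = [conv1x1(x, w) for w in branch_weights]
    return np.concatenate(branches, axis=0)

# A CIFAR-sized input: 3 channels, 32x32 pixels.
x = np.random.rand(3, 32, 32)
# Three hypothetical branches producing 8, 16, and 4 channels.
ws = [np.random.rand(8, 3), np.random.rand(16, 3), np.random.rand(4, 3)]
y = inception_block(x, ws)
print(y.shape)  # (28, 32, 32): 8 + 16 + 4 channels, spatial size preserved
```

Because every branch preserves the spatial dimensions, the concatenated output can feed directly into the next module, which is what lets GoogleNet stack many such blocks into a deep network.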