{"title":"基于 CIFAR 数据集的变换器和 GoogleNet 模型在图像分类中的对比分析","authors":"Xinran Xie, Qinwen Yan, Haoye Li, Sujie Yan, Zirong Jiang","doi":"10.54254/2755-2721/79/20241537","DOIUrl":null,"url":null,"abstract":"Image classification plays a pivotal role in numerous applications, with substantial implications for daily life, including diagnosing disease from medical images and management of images in autonomous vehicles. However, such sort of research in this field continuously challenges scientists in terms of choosing datasets, testing accuracy, and improvement of models, etc. In this paper, we focus on the performance of two prominent models GoogleNet and residual attention network. We construct two models on the Python platform according to available online resources. To assess their capabilities, we employ the CIFAR-100 dataset, a widely used benchmark dataset. Despite the simplicity of our implementations, GoogleNet comprises approximately 75 convolutional layers and inception modules, and the Residual Attention Network incorporates multiple attention modules within its architecture. These characteristics demonstrate the models' potential for achieving exceptional classification results. Through comprehensive testing and visualization, we aim to provide insights into the efficacy of these models in the context of image classification. Our study contributes to a broader and profounder understanding of their suitability for real-world applications. According to our diagrams and analysis, we conclude that although attention56 is suitable to be adopted in image classification concerning its structure since the model is unstable and invalid in a wide range of training image data on dataset SIFAR100 it might not be exploited in practice. However, as to the model GoogleNet, with an increasing number of training, it obviously is prone to robustness and solid capability of noise resistance. 
Therefore, GoogleNet is a suitable one to be employed in image classification.","PeriodicalId":502253,"journal":{"name":"Applied and Computational Engineering","volume":"30 8","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Comparative analysis of transformer and GoogleNet models in image classification based on the CIFAR dataset\",\"authors\":\"Xinran Xie, Qinwen Yan, Haoye Li, Sujie Yan, Zirong Jiang\",\"doi\":\"10.54254/2755-2721/79/20241537\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Image classification plays a pivotal role in numerous applications, with substantial implications for daily life, including diagnosing disease from medical images and management of images in autonomous vehicles. However, such sort of research in this field continuously challenges scientists in terms of choosing datasets, testing accuracy, and improvement of models, etc. In this paper, we focus on the performance of two prominent models GoogleNet and residual attention network. We construct two models on the Python platform according to available online resources. To assess their capabilities, we employ the CIFAR-100 dataset, a widely used benchmark dataset. Despite the simplicity of our implementations, GoogleNet comprises approximately 75 convolutional layers and inception modules, and the Residual Attention Network incorporates multiple attention modules within its architecture. These characteristics demonstrate the models' potential for achieving exceptional classification results. Through comprehensive testing and visualization, we aim to provide insights into the efficacy of these models in the context of image classification. Our study contributes to a broader and profounder understanding of their suitability for real-world applications. 
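The abstract's note that GoogleNet organizes its convolutional layers into inception modules refers to the key idea of running several parallel branches over the same input and concatenating their outputs along the channel axis. Below is a minimal NumPy sketch of that channel-concatenation idea, reduced to 1x1 (pointwise) branches only; a real inception module also includes 3x3 and 5x5 convolution branches and a pooling branch, and the branch widths here are illustrative, not taken from the paper.

```python
import numpy as np

def conv1x1(x, w):
    """Pointwise (1x1) convolution as a matrix multiply.
    x: input of shape (C_in, H, W); w: weights of shape (C_out, C_in)."""
    c_in, h, wd = x.shape
    return (w @ x.reshape(c_in, -1)).reshape(w.shape[0], h, wd)

def inception_block(x, branch_weights):
    """Run parallel branches over the same input and concatenate
    their outputs along the channel axis, as in GoogleNet."""
    branches = [conv1x1(x, w) for w in branch_weights]
    return np.concatenate(branches, axis=0)

# A CIFAR-sized input: 3 channels, 32x32 pixels.
x = np.random.rand(3, 32, 32)
# Three hypothetical branches producing 8, 16, and 4 channels.
ws = [np.random.rand(8, 3), np.random.rand(16, 3), np.random.rand(4, 3)]
y = inception_block(x, ws)
print(y.shape)  # (28, 32, 32): 8 + 16 + 4 channels, spatial size preserved
```

Because every branch preserves the spatial dimensions, the concatenated output can feed directly into the next module, which is what lets GoogleNet stack many such blocks into a deep network.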