Identification of apple variety using machine vision and deep learning with Multi-Head Attention mechanism and GLCM

IF 3.3; Zone 3 (Agricultural and Forestry Sciences); JCR Q2 (Food Science & Technology)

Journal of Food Measurement and Characterization, vol. 19, no. 9, pp. 6540-6558; Pub Date: 2025-06-16; DOI: 10.1007/s11694-025-03385-5

Zhiming Guo, Haidi Xiao, Zhiqiang Dai, Chen Wang, Chanjun Sun, Nicholas Watson, Megan Povey, Xiaobo Zou


Citations: 0

Abstract

Apple variety identification plays a crucial role in pomology and agricultural sciences, as it can help growers optimize orchard management, enhance product quality, and meet consumer demand. Traditional identification methods based on visual observation are affected by factors such as subjective human judgment and inter-cultivar variability. To address these challenges, with the support of the China Agriculture Research Systems for Apple Industry and Jiangsu University, we collected sample images of eleven common apple varieties in China, then applied image enhancement and dataset expansion to establish an apple sample database. A Convolutional Neural Network (CNN), MobileNet Version 2 (MobileNetV2), and Visual Geometry Group 19 (VGG19) were then used to classify apple varieties from the images. Two optimization techniques, Multi-Head Attention and the Gray-Level Co-occurrence Matrix (GLCM), were incorporated to further improve classification accuracy. The baseline CNN achieved an accuracy of 96.46%, while MobileNetV2 and VGG19 reached 97.78% and 97.25%, respectively. Multi-Head Attention improved feature extraction but sometimes reduced performance, as observed with MobileNetV2 (87.33%). In contrast, GLCM significantly improved model accuracy, with MobileNetV2 achieving the highest accuracy (98.25%) and the lowest Mean Absolute Error (MAE) (0.0571). GLCM consistently outperformed the other techniques across all models and proved particularly effective for texture-rich image classification. These findings suggest that GLCM is a powerful enhancement for deep learning models, improving accuracy, precision, and recall in apple variety classification, with MobileNetV2 combined with GLCM yielding the best overall results.
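The abstract does not say how the GLCM features were computed or fed into the networks, so the following is only a minimal sketch of the GLCM idea itself, not the authors' pipeline. It counts how often pairs of gray levels occur at a fixed pixel offset, normalises the counts, and derives three common texture descriptors. The function names, the offset convention, and the toy 4x4 patch are all assumptions made for illustration.

```python
from collections import Counter


def glcm(image, levels, dx=1, dy=0, symmetric=True):
    """Return a normalised gray-level co-occurrence matrix as a nested list.

    image  : list of rows of integer gray levels in [0, levels)
    dx, dy : column/row offset of the neighbour paired with each pixel
    """
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[(i, j)] += 1
                if symmetric:
                    # Count the pair in both directions so the matrix is symmetric.
                    counts[(j, i)] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("offset leaves no valid pixel pairs")
    return [[counts[(i, j)] / total for j in range(levels)] for i in range(levels)]


def texture_features(p):
    """Compute contrast, homogeneity and angular second moment from a GLCM."""
    n = len(p)
    cells = [(i, j) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(p[i][j] * (i - j) ** 2 for i, j in cells),
        "homogeneity": sum(p[i][j] / (1 + (i - j) ** 2) for i, j in cells),
        "asm": sum(p[i][j] ** 2 for i, j in cells),
    }


# Toy 4x4 patch with 4 gray levels (hypothetical data, not from the paper).
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
features = texture_features(glcm(patch, levels=4))
```

In practice such descriptors are usually computed for several offsets and directions on a quantised grayscale image. Whether they are then concatenated with CNN features or supplied as extra input channels is a design choice the abstract leaves open.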

Source journal

Journal of Food Measurement and Characterization (Agricultural and Biological Sciences: Food Science)

CiteScore: 6.00
Self-citation rate: 11.80%
Articles published: 425

About the journal: This interdisciplinary journal publishes new measurement results, characteristic properties, differentiating patterns, measurement methods and procedures for such purposes as food process innovation, product development, quality control, and safety assurance. The journal encompasses all topics related to food property measurement and characterization, including all types of measured properties of food and food materials, features and patterns, measurement principles and techniques, development and evaluation of technologies, novel uses and applications, and industrial implementation of systems and procedures.