Research on tea buds detection based on optimized YOLOv5s
Authors: Guanli Li, Jianqiang Lu, Dong Zhang, Zhongyi Guo
Journal: IET Image Processing, vol. 19, no. 1
Publication date: 2025-02-05
Publication type: Journal Article
DOI: 10.1049/ipr2.13319
URL: https://onlinelibrary.wiley.com/doi/10.1049/ipr2.13319
PDF: https://onlinelibrary.wiley.com/doi/epdf/10.1049/ipr2.13319
Impact factor: 2.0; JCR: Q3 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)
Region category: Computer Science (region 4)
Source platform: Semanticscholar
Citations: 0
Abstract
Tea is one of the world's most popular beverages, and identifying tea shoots plays a significant role in improving the efficiency and quality of tea production. However, because tea buds have a complex morphology, are small, and are susceptible to factors such as lighting and occlusion, traditional identification methods suffer from low accuracy and efficiency. Deep-learning-based object detection methods have emerged as promising solutions to these challenges, yet current object detection technologies still face limitations when detecting tea buds under these conditions. In this study, image enhancement techniques such as HSV transformation, horizontal flipping, and vertical flipping were applied to the training dataset to improve model robustness and generalization across varying lighting and angles. To enhance identification performance, this article proposes an improved YOLOv5s (You Only Look Once version 5, small model) algorithm. In the improved algorithm, CBAM, SE, and CA attention mechanisms were incorporated into the backbone network to augment feature extraction, and a weighted Bidirectional Feature Pyramid Network (BiFPN) was employed in the neck network to boost performance, resulting in the YOLOv5s_teabuds model. Experimental results indicated that the improved model significantly outperformed the original in precision, recall, mAP, and F1-score. The CA attention mechanism provided the most notable improvement, raising precision, recall, mAP, and F1-score by 18.119%, 9.633%, 16.496%, and 13.524%, respectively. After BiFPN was integrated, the YOLOv5s_teabuds model further strengthened performance and robustness, with precision, recall, mAP, and F1-score increased by 19.346%, 11.388%, 18.620%, and 15.059%, respectively. These results show that the optimized YOLOv5s model can provide a real-time, high-precision tea bud detection method for robotic harvesting.
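The augmentation step named in the abstract (HSV transformation, horizontal flipping, vertical flipping) can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function names (`hflip`, `vflip`, `flip_boxes`, `hsv_jitter`), the image representation (rows of RGB tuples with channels in [0, 1]), the gain values, and the YOLO-style normalized box format are all hypothetical choices made here. The point it shows is that a flip must be applied to the bounding-box labels as well as to the pixels, while an HSV change leaves the labels untouched.

```python
import colorsys


def hflip(image):
    """Mirror an image left-to-right; `image` is a list of rows of (r, g, b) tuples."""
    return [row[::-1] for row in image]


def vflip(image):
    """Mirror an image top-to-bottom."""
    return [list(row) for row in image[::-1]]


def flip_boxes(boxes, horizontal=False, vertical=False):
    """Flip boxes given as (x_center, y_center, width, height), normalized to [0, 1].

    Assumed label format for this sketch; widths and heights are unchanged by a flip.
    """
    flipped = []
    for xc, yc, w, h in boxes:
        if horizontal:
            xc = 1.0 - xc
        if vertical:
            yc = 1.0 - yc
        flipped.append((xc, yc, w, h))
    return flipped


def hsv_jitter(image, h_gain=1.0, s_gain=1.0, v_gain=1.0):
    """Scale hue, saturation, and value of every pixel by non-negative gains."""
    out = []
    for row in image:
        new_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r, g, b)
            h = (h * h_gain) % 1.0      # hue is cyclic, so wrap around
            s = min(1.0, s * s_gain)    # saturation and value are clipped to 1
            v = min(1.0, v * v_gain)
            new_row.append(colorsys.hsv_to_rgb(h, s, v))
        out.append(new_row)
    return out


# Usage on a hypothetical 2x2 image with one labelled box.
image = [[(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
         [(0.0, 0.0, 1.0), (0.5, 0.5, 0.5)]]
boxes = [(0.25, 0.25, 0.5, 0.5)]

mirrored = hflip(image)
mirrored_boxes = flip_boxes(boxes, horizontal=True)
darker = hsv_jitter(image, v_gain=0.5)  # simulates lower illumination
```

In a training loop the gains and the decision to flip would normally be drawn at random per image; they are explicit parameters here so the behaviour is easy to check.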
About the journal:
The IET Image Processing journal encompasses research areas related to the generation, processing and communication of visual information. The focus of the journal is the coverage of the latest research results in image and video processing, including image generation and display, enhancement and restoration, segmentation, colour and texture analysis, coding and communication, implementations and architectures as well as innovative applications.
Principal topics include:
Generation and Display - Imaging sensors and acquisition systems, illumination, sampling and scanning, quantization, colour reproduction, image rendering, display and printing systems, evaluation of image quality.
Processing and Analysis - Image enhancement, restoration, segmentation, registration, multispectral, colour and texture processing, multiresolution processing and wavelets, morphological operations, stereoscopic and 3-D processing, motion detection and estimation, video and image sequence processing.
Implementations and Architectures - Image and video processing hardware and software, design and construction, architectures and software, neural, adaptive, and fuzzy processing.
Coding and Transmission - Image and video compression and coding, compression standards, noise modelling, visual information networks, streamed video.
Retrieval and Multimedia - Storage of images and video, database design, image retrieval, video annotation and editing, mixed media incorporating visual information, multimedia systems and applications, image and video watermarking, steganography.
Applications - Innovative application of image and video processing technologies to any field, including life sciences, earth sciences, astronomy, document processing and security.