Tomato Recognition Method Based on the YOLOv8-Tomato Model in Complex Greenhouse Environments
Authors: Shuhe Zheng, Xuexin Jia, Minglei He, Zebin Zheng, Tianliang Lin, Wuxiong Weng
Journal: Agronomy
Published: 2024-08-12
DOI: 10.3390/agronomy14081764 (https://doi.org/10.3390/agronomy14081764)
Citations: 0
Abstract
Tomatoes are an economically important crop, and automating their harvest would help address labor shortages and improve harvesting efficiency. Accurate fruit recognition is the key to automated harvesting, and picking fruit at optimum ripeness preserves nutrient content, flavor, and market value, maximizing economic returns. In practice, foliage and non-target fruits occlude the target fruits and lighting alters their apparent color, so current methods suffer from low recognition rates and missed detections. Taking greenhouse tomatoes as the research object, this paper proposes a tomato recognition model, YOLOv8-Tomato, based on an improved YOLOv8 architecture for detecting tomato fruits in complex scenes. First, to improve the model’s sensitivity to local features, an LSKA (Large Separable Kernel Attention) mechanism is introduced to aggregate feature information from different locations for better feature extraction. Second, to obtain higher-quality upsampling, the ultra-lightweight and efficient dynamic upsampler DySample replaces the traditional nearest-neighbor interpolation, improving the overall performance of YOLOv8. Third, the Inner-IoU loss replaces the original CIoU loss function to speed up bounding-box regression and raise detection performance. Finally, comparison tests on a self-built dataset show that YOLOv8-Tomato reaches an mAP@0.5 of 99.4% and a recall of 99.0%, exceeding the original YOLOv8 model. Compared with the Faster R-CNN, SSD, YOLOv3-tiny, YOLOv5, and YOLOv8 models, its average accuracy is higher by 7.5%, 11.6%, 8.6%, 3.3%, and 0.6%, respectively. These results demonstrate that the model can recognize tomatoes efficiently and accurately in unstructured growing environments, providing a technical reference for automated tomato harvesting.
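As a rough illustration of the Inner-IoU idea mentioned in the abstract, the sketch below computes IoU on auxiliary boxes obtained by scaling each box about its own center. The function names, the (cx, cy, w, h) box layout, and the default ratio of 0.75 are assumptions made for illustration only; the abstract does not state the scaling ratio or the exact loss formulation used in YOLOv8-Tomato.

```python
def inner_iou(box_a, box_b, ratio=0.75):
    """IoU of auxiliary boxes scaled about each box's center by `ratio`.

    Boxes are (cx, cy, w, h). With ratio=1.0 this reduces to ordinary IoU.
    The default ratio is an illustrative assumption, not a value from the paper.
    """
    def corners(box):
        cx, cy, w, h = box
        half_w = w * ratio / 2.0
        half_h = h * ratio / 2.0
        return cx - half_w, cy - half_h, cx + half_w, cy + half_h

    ax1, ay1, ax2, ay2 = corners(box_a)
    bx1, by1, bx2, by2 = corners(box_b)

    # Overlap of the two auxiliary boxes (zero if they do not intersect).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def inner_iou_loss(pred, target, ratio=0.75):
    """Loss form 1 - IoU_inner; zero when the two boxes coincide."""
    return 1.0 - inner_iou(pred, target, ratio)
```

With a ratio below 1 the auxiliary boxes shrink, so partially overlapping boxes score a lower IoU than with the ordinary definition, while perfectly aligned boxes still score 1. For example, `inner_iou_loss((10, 10, 4, 4), (10, 10, 4, 4))` evaluates to 0.0.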