Multi-class detection of cherry tomatoes using improved YOLOv4-Tiny
Fu Zhang, Zijun Chen, Shaukat Ali, Ning Yang, Sanling Fu, Yakun Zhang
International Journal of Agricultural and Biological Engineering, 2023. DOI: 10.25165/j.ijabe.20231602.7744 (https://doi.org/10.25165/j.ijabe.20231602.7744)
Abstract
The rapid and accurate detection of cherry tomatoes is essential for automatic picking by robots. So far, however, cherry tomatoes have been detected as a single class for picking. Fruits occluded by branches or leaves are therefore detected as pickable objects, which may damage the plant or the robot end-effector during picking. This study proposed the Feature Enhancement Network Block (FENB) based on YOLOv4-Tiny to solve this problem. First, according to the distribution characteristics and picking strategies of cherry tomatoes, fruits in both daytime and nighttime images were divided into four classes: not occluded, occluded by branches, occluded by fruits, and occluded by leaves. Second, the CSPNet structure with a hybrid attention mechanism was used to design the FENB, which pays more attention to the effective features of the different classes of cherry tomatoes while retaining the original features. Finally, the Feature Enhancement Network (FEN) was constructed from the FENB to enhance the feature extraction ability and improve the detection accuracy of YOLOv4-Tiny. The experimental results show that, at a confidence threshold of 0.5, the average precision (AP) for non-occluded, branch-occluded, fruit-occluded, and leaf-occluded fruit on the day test images was 95.86%, 92.59%, 89.66%, and 84.99%, respectively, and 98.43%, 95.62%, 95.50%, and 89.33%, respectively, on the night test images. The mean Average Precision (mAP) over the four classes was higher on the night test set (94.72%) than on the day test set (90.78%), and both values were better than those of YOLOv4 and YOLOv4-Tiny. Processing a 416×416 image took 32.22 ms on the GPU. The model size was 39.34 MB. Therefore, the proposed model can provide a practical and feasible method for the multi-class detection of cherry tomatoes.
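As a quick consistency check on the reported figures, the sketch below recomputes the mAP values from the per-class AP values and converts the per-image latency into an approximate frame rate. It assumes that mAP is the unweighted arithmetic mean of the four per-class APs, which the abstract does not state explicitly but which its numbers are consistent with. The function and variable names are hypothetical and are not taken from the paper.

```python
from statistics import fmean

# Per-class AP values (%) as reported in the abstract.
DAY_AP = {
    "not_occluded": 95.86,
    "branch_occluded": 92.59,
    "fruit_occluded": 89.66,
    "leaf_occluded": 84.99,
}
NIGHT_AP = {
    "not_occluded": 98.43,
    "branch_occluded": 95.62,
    "fruit_occluded": 95.50,
    "leaf_occluded": 89.33,
}


def mean_average_precision(ap_by_class):
    """Unweighted mean of per-class AP values (assumed definition of mAP)."""
    return fmean(ap_by_class.values())


def frames_per_second(latency_ms):
    """Convert a per-image latency in milliseconds to images per second."""
    return 1000.0 / latency_ms


day_map = mean_average_precision(DAY_AP)      # close to the reported 90.78%
night_map = mean_average_precision(NIGHT_AP)  # close to the reported 94.72%
fps = frames_per_second(32.22)                # roughly 31 images per second

print(f"day mAP   = {day_map:.3f}%")
print(f"night mAP = {night_map:.3f}%")
print(f"throughput = {fps:.1f} images/s")
```

The recomputed means agree with the reported mAP values to within rounding, and the 32.22 ms latency corresponds to roughly 31 images per second on the GPU used in the study.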
About the journal:
International Journal of Agricultural and Biological Engineering (IJABE, https://www.ijabe.org) is a peer-reviewed, open access international journal. Started in 2008, IJABE is a joint publication co-sponsored by the US-based Association of Overseas Chinese Agricultural, Biological and Food Engineers (AOCABFE) and the China-based Chinese Society of Agricultural Engineering (CSAE). The ISSN (1934-6344) and eISSN (1934-6352) for the print and online editions of IJABE are registered in the US. IJABE (Int. J. Agric. & Biol. Eng.) is published in both online and print versions by the Chinese Academy of Agricultural Engineering.