Automatic Apple Detection and Counting with AD-YOLO and MR-SORT
Authors: Xueliang Yang, Yapeng Gao, Mengyu Yin, Haifang Li
Journal: Sensors, volume 24, issue 21 (published 2024-10-31)
Journal metrics: impact factor 3.4; JCR Q2 (Chemistry, Analytical)
DOI: https://doi.org/10.3390/s24217012
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11548465/pdf/
In agricultural production management, accurate fruit counting is vital for orchard yield estimation and for sound production decisions. Although recent tracking-by-detection algorithms are a promising approach to fruit counting, they remain vulnerable to fruit occlusion and lighting variations in complex orchard environments, which makes automatic and accurate apple counting difficult. This paper proposes MR-SORT (Multiple Rematching SORT), a video-based multiple-object tracking method built on an improved YOLOv8 and BoT-SORT. First, we propose the AD-YOLO model, which aims to reduce the number of incorrect detections during object tracking. In the YOLOv8s backbone network, an Omni-dimensional Dynamic Convolution (ODConv) module is used to better extract local feature information and enhance the model's capability; a Global Attention Mechanism (GAM) is introduced to improve detection of the foreground object (apples) across the whole image; and a Soft Spatial Pyramid Pooling Layer (SSPPL) is designed to reduce the dispersion of feature information and enlarge the receptive field of the network. Then, an improved BoT-SORT algorithm is proposed that fuses a verification mechanism, SURF feature descriptors, and the Vector of Locally Aggregated Descriptors (VLAD) algorithm, so that apples are matched more accurately across adjacent video frames and ID switches during tracking become less likely. The results show that the mAP of the proposed AD-YOLO model reaches 96.4%, which is 3.1% higher than that of the YOLOv8 model. The improved tracking algorithm produces 297 fewer ID switches than the original algorithm, a reduction of 35.6%. The multiple-object tracking accuracy of the improved algorithm reached 85.6%, and the average counting error was reduced to 0.07. The coefficient of determination R² between the ground truth and the predicted values reached 0.98. These metrics show that our method gives more accurate counting results for apples and even for other types of fruit.
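The evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the abstract does not define how the average counting error is computed, so the mean relative error used here is an assumption, and the multiple-object tracking accuracy follows the standard CLEAR MOT definition (MOTA). All function names and sample numbers are hypothetical.

```python
from typing import Sequence


def r_squared(truth: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_truth = sum(truth) / len(truth)
    ss_res = sum((t - p) ** 2 for t, p in zip(truth, predicted))
    ss_tot = sum((t - mean_truth) ** 2 for t in truth)
    return 1.0 - ss_res / ss_tot


def mean_relative_count_error(
    truth: Sequence[float], predicted: Sequence[float]
) -> float:
    """Average of |predicted - truth| / truth over all videos.

    Assumption: the paper may define its counting error differently.
    """
    errors = [abs(p - t) / t for t, p in zip(truth, predicted)]
    return sum(errors) / len(errors)


def mota(
    false_negatives: int, false_positives: int, id_switches: int, num_gt: int
) -> float:
    """CLEAR MOT accuracy: 1 - (FN + FP + IDSW) / number of ground-truth objects."""
    return 1.0 - (false_negatives + false_positives + id_switches) / num_gt


if __name__ == "__main__":
    # Hypothetical per-video apple counts, not data from the paper.
    manual_counts = [100, 120, 80, 150]
    tracked_counts = [96, 125, 78, 155]
    print("R^2:", r_squared(manual_counts, tracked_counts))
    print("Mean relative error:",
          mean_relative_count_error(manual_counts, tracked_counts))
    # Hypothetical tracking error totals accumulated over a sequence.
    print("MOTA:", mota(false_negatives=60, false_positives=30,
                        id_switches=10, num_gt=1000))
```

Because MOTA penalizes every ID switch alongside missed and false detections, reducing ID switches raises MOTA directly, and it also lowers the risk of counting the same apple twice under a new track ID.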
About the journal:
Sensors (ISSN 1424-8220) provides an advanced forum for the science and technology of sensors and biosensors. It publishes reviews (including comprehensive reviews on the complete sensors products), regular research papers and short notes. Our aim is to encourage scientists to publish their experimental and theoretical results in as much detail as possible. There is no restriction on the length of the papers. The full experimental details must be provided so that the results can be reproduced.