Preety Baglat , Ahatsham Hayat , Sheikh Shanawaz Mostafa , Fábio Mendonça , Fernando Morgado-Dias
{"title":"YOLO世代香蕉束检测的比较分析与评价","authors":"Preety Baglat , Ahatsham Hayat , Sheikh Shanawaz Mostafa , Fábio Mendonça , Fernando Morgado-Dias","doi":"10.1016/j.atech.2025.101100","DOIUrl":null,"url":null,"abstract":"<div><div>This study focuses on improving the automation of banana harvesting decisions for farmers with artificial intelligence assistance. Traditionally, experienced harvesters manually inspect fields to determine the optimal harvesting time, a process that is both labor-intensive and increasingly unsustainable due to a shortage of skilled workers. To address this challenge, this work proposes a computer vision-based approach for detecting banana bunches in images captured by mobile phones, as a preliminary step towards a comprehensive harvesting decision pipeline. To achieve this, a dataset was collected with 2179 photos of multiple Cavendish banana bunches in different light and exposure conditions, and a comparative analysis of You Only Look Once (YOLO) object detection models was conducted, from version 1 to 12, to identify the most accurate and efficient solution for banana bunch detection, ensuring compatibility with mobile-based applications. 
Among all models evaluated, YOLOv12n achieved the most balanced performance on five-fold cross-validation, with 93 % Average Precision (AP<sup>50test</sup>), 51 % AP<sup>50–95test</sup>, and 5.1 ms latency, making it well-suited for real-time deployment on resource-constrained edge devices.</div></div>","PeriodicalId":74813,"journal":{"name":"Smart agricultural technology","volume":"12 ","pages":"Article 101100"},"PeriodicalIF":5.7000,"publicationDate":"2025-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Comparative analysis and evaluation of YOLO generations for banana bunch detection\",\"authors\":\"Preety Baglat , Ahatsham Hayat , Sheikh Shanawaz Mostafa , Fábio Mendonça , Fernando Morgado-Dias\",\"doi\":\"10.1016/j.atech.2025.101100\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>This study focuses on improving the automation of banana harvesting decisions for farmers with artificial intelligence assistance. Traditionally, experienced harvesters manually inspect fields to determine the optimal harvesting time, a process that is both labor-intensive and increasingly unsustainable due to a shortage of skilled workers. To address this challenge, this work proposes a computer vision-based approach for detecting banana bunches in images captured by mobile phones, as a preliminary step towards a comprehensive harvesting decision pipeline. To achieve this, a dataset was collected with 2179 photos of multiple Cavendish banana bunches in different light and exposure conditions, and a comparative analysis of You Only Look Once (YOLO) object detection models was conducted, from version 1 to 12, to identify the most accurate and efficient solution for banana bunch detection, ensuring compatibility with mobile-based applications. 
Among all models evaluated, YOLOv12n achieved the most balanced performance on five-fold cross-validation, with 93 % Average Precision (AP<sup>50test</sup>), 51 % AP<sup>50–95test</sup>, and 5.1 ms latency, making it well-suited for real-time deployment on resource-constrained edge devices.</div></div>\",\"PeriodicalId\":74813,\"journal\":{\"name\":\"Smart agricultural technology\",\"volume\":\"12 \",\"pages\":\"Article 101100\"},\"PeriodicalIF\":5.7000,\"publicationDate\":\"2025-06-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Smart agricultural technology\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2772375525003338\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"AGRICULTURAL ENGINEERING\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Smart agricultural technology","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2772375525003338","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AGRICULTURAL ENGINEERING","Score":null,"Total":0}
Citations: 0
Abstract
Comparative analysis and evaluation of YOLO generations for banana bunch detection
This study focuses on improving the automation of banana harvesting decisions for farmers with artificial-intelligence assistance. Traditionally, experienced harvesters manually inspect fields to determine the optimal harvesting time, a process that is labor-intensive and increasingly unsustainable due to a shortage of skilled workers. To address this challenge, this work proposes a computer-vision approach for detecting banana bunches in images captured by mobile phones, as a preliminary step towards a comprehensive harvesting-decision pipeline. To achieve this, a dataset of 2179 photos of multiple Cavendish banana bunches was collected under different light and exposure conditions, and a comparative analysis of You Only Look Once (YOLO) object-detection models, from version 1 to 12, was conducted to identify the most accurate and efficient solution for banana bunch detection while ensuring compatibility with mobile-based applications. Among all models evaluated, YOLOv12n achieved the most balanced performance under five-fold cross-validation, with 93% AP50 and 51% AP50–95 on the test set and 5.1 ms latency, making it well suited for real-time deployment on resource-constrained edge devices.
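The AP50 and AP50–95 figures in the abstract rest on the standard Intersection-over-Union (IoU) matching criterion: a predicted bounding box counts as a true positive at AP50 if its IoU with a ground-truth box is at least 0.50, while AP50–95 averages the result over IoU thresholds from 0.50 to 0.95 in steps of 0.05. A minimal sketch of that criterion, with illustrative box coordinates that are not taken from the paper's dataset:

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

pred = (10, 10, 60, 60)   # hypothetical predicted banana-bunch box
truth = (15, 12, 65, 58)  # hypothetical ground-truth box

v = iou(pred, truth)
# AP50 counts this prediction as a true positive if IoU >= 0.50;
# AP50-95 repeats the check at thresholds 0.50, 0.55, ..., 0.95.
thresholds = [0.50 + 0.05 * i for i in range(10)]
hits = [t for t in thresholds if v >= t]
print(round(v, 3), len(hits))  # prints: 0.758 6
```

Here the prediction would be matched at the six thresholds up to 0.75 but missed at the stricter ones, which is why AP50–95 is typically much lower than AP50, as in the reported 51% versus 93%.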