Automated Grapevine Inflorescence Counting in a Vineyard Using Deep Learning and Multi-object Tracking

Umme Fawzia Rahim, T. Utsumi, Yohei Iwaki, H. Mineno
{"title":"Automated Grapevine Inflorescence Counting in a Vineyard Using Deep Learning and Multi-object Tracking","authors":"Umme Fawzia Rahim, T. Utsumi, Yohei Iwaki, H. Mineno","doi":"10.1109/ICCAE56788.2023.10111243","DOIUrl":null,"url":null,"abstract":"To adjust management practices and improve wine marketing strategies, accurate vineyard yield estimation early in the growing season is essential. Conventional methods for yield forecasting rely on phenotypic features’ manual assessment, which is time- and labor-intensive and often destructive. We combined a deep object segmentation method, mask region-based convolutional neural network (Mask R-CNN), with two potential multi-object tracking algorithms, simple online and real-time tracking (SORT) and intersection-over-union (IOU) trackers to develop a complete visual system that can automatically detect and track individual inflorescences, enabling the assessment of the number of inflorescences per vineyard row from vineyard video footage. The performance of the two tracking algorithms was evaluated using our vineyard dataset, which is more challenging than conventional tracking benchmark datasets owing to environmental factors. Our evaluation dataset consists of videos of four vineyard rows, including 221 vines that were automatically acquired under unprepared field conditions. We tracked individual inflorescences across video image frames with a 92.1% multi-object tracking accuracy (MOTA) and an 89.6% identity F1 score (IDF1). This allowed us to estimate inflorescence count per vineyard row with a 0.91 coefficient of determination (R2) between the estimated count and manual-annotated ground truth count. The impact of leaf occlusions on inflorescence visibility was lessened by processing multiple successive image frames with minimal displacements to construct multiple camera views. This study demonstrates the use of deep learning and multi-object tracking in creating a low-cost (requiring only an RGB camera), high-throughput phenotyping system for precision viticulture.","PeriodicalId":406112,"journal":{"name":"2023 15th International Conference on Computer and Automation Engineering (ICCAE)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 15th International Conference on Computer and Automation Engineering (ICCAE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICCAE56788.2023.10111243","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

To adjust management practices and improve wine marketing strategies, accurate vineyard yield estimation early in the growing season is essential. Conventional methods for yield forecasting rely on manual assessment of phenotypic features, which is time- and labor-intensive and often destructive. We combined a deep object segmentation method, the mask region-based convolutional neural network (Mask R-CNN), with two candidate multi-object tracking algorithms, simple online and real-time tracking (SORT) and an intersection-over-union (IOU) tracker, to develop a complete visual system that automatically detects and tracks individual inflorescences, enabling the number of inflorescences per vineyard row to be estimated from vineyard video footage. The performance of the two tracking algorithms was evaluated on our vineyard dataset, which is more challenging than conventional tracking benchmark datasets owing to environmental factors. Our evaluation dataset consists of videos of four vineyard rows covering 221 vines, acquired automatically under unprepared field conditions. We tracked individual inflorescences across video frames with a 92.1% multi-object tracking accuracy (MOTA) and an 89.6% identity F1 score (IDF1). This allowed us to estimate the inflorescence count per vineyard row with a coefficient of determination (R²) of 0.91 between the estimated count and the manually annotated ground-truth count. The impact of leaf occlusion on inflorescence visibility was lessened by processing multiple successive image frames with minimal displacements to construct multiple camera views. This study demonstrates the use of deep learning and multi-object tracking to create a low-cost (requiring only an RGB camera), high-throughput phenotyping system for precision viticulture.
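The counting-by-tracking idea described in the abstract can be illustrated with a minimal sketch. The snippet below is not the authors' implementation: it assumes per-frame inflorescence detections are already available as bounding boxes (e.g., produced by any Mask R-CNN instance-segmentation model) and links them with a simple greedy IOU matcher, so the estimated count for a vineyard row is just the number of distinct track IDs created over the video.

```python
"""Minimal sketch of counting objects in a video by IOU-based tracking.

Assumptions (not from the paper): detections arrive as (x1, y1, x2, y2)
boxes per frame; tracks are linked with a greedy IOU matcher rather than
the authors' exact SORT/IOU tracker configuration.
"""
from itertools import count


def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((ax2 - ax1) * (ay2 - ay1)
             + (bx2 - bx1) * (by2 - by1) - inter)
    return inter / union if union > 0 else 0.0


def count_tracks(frames, iou_threshold=0.3):
    """Link per-frame detections greedily by IOU and return the number
    of distinct track IDs, i.e. the estimated inflorescence count.

    `frames` is an iterable of lists of (x1, y1, x2, y2) boxes.
    """
    next_id = count()
    active = {}        # track_id -> box in the previous frame
    total_ids = 0

    for boxes in frames:
        matched = {}
        unmatched = list(boxes)
        # Greedily match each active track to its best remaining detection.
        for tid, last_box in active.items():
            best_iou, best_box = 0.0, None
            for box in unmatched:
                score = iou(last_box, box)
                if score > best_iou:
                    best_iou, best_box = score, box
            if best_box is not None and best_iou >= iou_threshold:
                matched[tid] = best_box
                unmatched.remove(best_box)
        # Unmatched detections start new tracks (newly seen inflorescences).
        for box in unmatched:
            matched[next(next_id)] = box
            total_ids += 1
        active = matched

    return total_ids


if __name__ == "__main__":
    # Two toy frames: one inflorescence shifts slightly, a second one appears.
    demo = [
        [(10, 10, 50, 60)],
        [(14, 12, 54, 62), (100, 20, 140, 70)],
    ]
    print(count_tracks(demo))  # -> 2
```

A full tracker such as SORT additionally predicts each track's motion with a Kalman filter and solves the frame-to-frame assignment with the Hungarian algorithm, which makes it more robust to missed detections and occlusion than this greedy sketch.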
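For reference, the reported tracking and counting metrics follow their standard definitions (CLEAR-MOT, IDF1, and the coefficient of determination); the formulas below are the conventional ones, not restated from the paper:

$$
\mathrm{MOTA} = 1 - \frac{\sum_t \left(\mathrm{FN}_t + \mathrm{FP}_t + \mathrm{IDSW}_t\right)}{\sum_t \mathrm{GT}_t},
\qquad
\mathrm{IDF1} = \frac{2\,\mathrm{IDTP}}{2\,\mathrm{IDTP} + \mathrm{IDFP} + \mathrm{IDFN}},
\qquad
R^2 = 1 - \frac{\sum_i \left(y_i - \hat{y}_i\right)^2}{\sum_i \left(y_i - \bar{y}\right)^2},
$$

where FN, FP, IDSW, and GT are the per-frame false negatives, false positives, identity switches, and ground-truth objects, IDTP/IDFP/IDFN are identity-level true/false positives and false negatives, and \(y_i\), \(\hat{y}_i\) denote the manually annotated and estimated per-row counts.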