Maturity recognition and localisation of broccoli under occlusion based on RGB-D instance segmentation network
Shuo Kang, Jiali Fan, Yongkai Ye, Chenglong Li, Dongdong Du, Jun Wang
Biosystems Engineering, volume 250, pages 270-284. Published 2025-02-01. DOI: 10.1016/j.biosystemseng.2025.01.007
https://www.sciencedirect.com/science/article/pii/S1537511025000078
Abstract
Selective harvesting robots for broccoli face significant challenges in field operations, where occlusion by leaves and stems, varying maturity stages and lighting interference greatly affect performance. To meet the need for a robust network capable of maturity recognition and localisation of spherical crops under various occlusion conditions, OccluInst, a single-stage instance segmentation network based on RGB-D input and a CNN-Transformer architecture, is proposed. The approach makes full use of visible information and crop characteristics. The model builds a dual-branch cross-modal calibration framework to generate instance-aware kernels and segmentation mask features. The proposed Attention Weight Interactive Fusion Module (AWIF) enhances the fusion efficiency of multi-scale RGB and depth features in complex scenarios, while the Adaptive Fusion Ratio Module (AFR) filters out noisy depth data and extracts valuable information to achieve feature alignment. Additionally, the Material Awareness Module (MA) highlights critical areas, improving feature extraction for irregular, multi-scale targets. An improved circular boundary anchor box localises broccoli accurately under various levels of occlusion. Ablation studies confirm the effectiveness of each module. OccluInst quickly and accurately identifies the maturity categories and coordinates of broccoli under different occlusion levels. It achieves an mAP50 of 86.2% and an mAR of 83.5%, with an average centre point deviation of 3.68 pixels on images with a resolution of 848 × 480, and a detection speed of 51.4 frames per second, providing a robust visual foundation for selective harvesting robots.
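The abstract reports an average centre point deviation in pixels but does not state how it is computed. As a minimal sketch, assuming the metric is the mean Euclidean distance between predicted and annotated centres of already matched instances, it could be evaluated as below. The function name `mean_centre_deviation` and the sample coordinates are hypothetical and are not taken from the paper.

```python
import math


def mean_centre_deviation(predicted, ground_truth):
    """Mean Euclidean distance, in pixels, between matched centre points.

    Assumption: predicted[i] and ground_truth[i] refer to the same instance.
    How the paper matches predictions to annotations is not stated.
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("predicted and ground_truth must have equal length")
    if not predicted:
        raise ValueError("at least one centre pair is required")
    distances = [math.dist(p, g) for p, g in zip(predicted, ground_truth)]
    return sum(distances) / len(distances)


# Hypothetical (x, y) centres in pixels on an 848 x 480 image.
predicted = [(100.0, 200.0), (403.0, 244.0)]
ground_truth = [(103.0, 204.0), (400.0, 240.0)]

deviation = mean_centre_deviation(predicted, ground_truth)
print(f"mean centre deviation: {deviation:.2f} px")
```

Under this assumed definition, the reported 3.68 pixels would mean that predicted centres lie on average under four pixels from the annotated centres at the 848 × 480 resolution.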
About the journal:
Biosystems Engineering publishes research in engineering and the physical sciences that represents advances in understanding or modelling the performance of biological systems for sustainable developments in land use and the environment, agriculture and amenity, bioproduction processes and the food chain. The subject matter of the journal reflects the wide range and interdisciplinary nature of research in engineering for biological systems.