3D plant segmentation: Comparing a 2D-to-3D segmentation method with state-of-the-art 3D segmentation algorithms
Bart M. van Marrewijk, Tim van Daalen, Bolai Xin, Eldert J. van Henten, Gerrit Polder, Gert Kootstra
Biosystems Engineering, volume 254, Article 104147. Published 2025-04-17. DOI: 10.1016/j.biosystemseng.2025.104147
https://www.sciencedirect.com/science/article/pii/S1537511025000832
Abstract
Plant measurements are crucial to determine which plants grow optimally under certain conditions. These measurements can be taken by hand or automated using cameras, an approach known as image-based plant phenotyping. The images can be used to create point clouds from which plant traits are measured in 3D. Extracting plant traits requires accurate segmentation. Most point cloud segmentation methods rely on 3D segmentation algorithms, which are not as advanced and developed as 2D algorithms. In addition, 2D neural networks are pre-trained on large, diverse datasets. It was therefore hypothesised that segmentation of point clouds using projection-based methods can obtain a higher accuracy than voxel- or point-based algorithms. To test this hypothesis, a 2D-to-3D reprojection method was developed and compared with three state-of-the-art 3D segmentation algorithms: Swin3D-s, Point Transformer v3 and MinkUNet34C. The 2D-to-3D method segmented images using Mask2Former, reprojected the predictions to the point cloud, and used a majority vote algorithm to merge multiple predictions. All algorithms were trained and tested to segment 3D point clouds into leaves, main stem, side stem, and pole. There was no significant difference between the 2D-to-3D, Swin3D-s and Point Transformer v3 algorithms, indicating that state-of-the-art voxel- or point-based methods perform similarly to our projection-based method. However, the 2D-to-3D method achieved a higher performance when virtual cameras were included, and it had a higher training efficiency. With only five annotated plants, it obtained a performance similar to that of Swin3D-s trained on 25 plants, indicating the added value of the developed pipeline.
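The reprojection and majority-vote step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simple pinhole camera per view (intrinsics `K`, rotation `R`, translation `t`), per-pixel class maps already produced by a 2D segmentation network, and no occlusion handling. The function names `project_points` and `majority_vote_labels` are hypothetical.

```python
import numpy as np


def project_points(points, K, R, t):
    """Project (N, 3) world points to pixel coordinates (assumed pinhole model)."""
    cam = points @ R.T + t                    # world frame -> camera frame
    depth = cam[:, 2]
    safe = np.where(depth > 0, depth, 1.0)    # avoid dividing by zero or negative depth
    uv = (cam @ K.T)[:, :2] / safe[:, None]   # perspective division
    return uv, depth


def majority_vote_labels(points, views, num_classes, unlabeled=-1):
    """Assign each 3D point the class it receives most often across all views.

    `views` is an iterable of (label_map, K, R, t) tuples, where label_map is
    an (H, W) integer array of 2D predictions. Occlusion is ignored here,
    which is a simplification for illustration.
    """
    votes = np.zeros((len(points), num_classes), dtype=np.int64)
    for label_map, K, R, t in views:
        h, w = label_map.shape
        uv, depth = project_points(points, K, R, t)
        cols = np.round(uv[:, 0]).astype(int)
        rows = np.round(uv[:, 1]).astype(int)
        # Keep points in front of the camera that land inside the image.
        visible = (depth > 0) & (cols >= 0) & (cols < w) & (rows >= 0) & (rows < h)
        idx = np.nonzero(visible)[0]
        labels = label_map[rows[idx], cols[idx]]
        keep = labels >= 0                    # skip pixels without a prediction
        np.add.at(votes, (idx[keep], labels[keep]), 1)
    result = votes.argmax(axis=1)             # ties resolve to the lowest class index
    result[votes.sum(axis=1) == 0] = unlabeled
    return result
```

In this sketch a point that is never seen by any view keeps the `unlabeled` value, and ties go to the lowest class index. A real pipeline would also need a visibility test so that points hidden behind leaves do not inherit the label of the surface in front of them.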
About the journal:
Biosystems Engineering publishes research in engineering and the physical sciences that represents advances in understanding or modelling of the performance of biological systems for sustainable developments in land use and the environment, agriculture and amenity, bioproduction processes and the food chain. The subject matter of the journal reflects the wide range and interdisciplinary nature of research in engineering for biological systems.