Towards complete tree crown delineation by instance segmentation with Mask R–CNN and DETR using UAV-based multispectral imagery and lidar data

S. Dersch, A. Schöttl, P. Krzystek, M. Heurich
{"title":"Towards complete tree crown delineation by instance segmentation with Mask R–CNN and DETR using UAV-based multispectral imagery and lidar data","authors":"S. Dersch ,&nbsp;A. Schöttl ,&nbsp;P. Krzystek ,&nbsp;M. Heurich","doi":"10.1016/j.ophoto.2023.100037","DOIUrl":null,"url":null,"abstract":"<div><p>Precise single tree delineation allows for a more reliable determination of essential parameters such as tree species, height and vitality. Methods of instance segmentation are powerful neural networks for detecting and segmenting single objects and have the potential to push the accuracy of tree segmentation methods to a new level. In this study, two instance segmentation methods, Mask R–CNN and DETR, were applied to precisely delineate single tree crowns using multispectral images and images generated from UAV lidar data. The study area was in Bavaria, 35 km north of Munich (Germany), comprising a mixed forest stand of around 7 ha characterised mainly by Norway spruce (<em>Picea abies</em>) and large groups of European beeches (<em>Fagus sylvatica</em>) with 181–236 trees per ha. The data set, consisting of multispectral images and lidar data, was acquired using a Micasense RedEdge-MX dual camera system and a Riegl miniVUX-1UAV lidar scanner, both mounted on a hexacopter (DJI Matrice 600 Pro). At an altitude of approximately 85 m, two flight missions were conducted at an airspeed of 5 m/s, leading to a ground resolution of 5 cm and a lidar point density of 560 points/<em>m</em><sup>2</sup>. In total, 1408 trees were marked by visual interpretation of the remote sensing data for training and validating the classifiers. Additionally, 125 trees were surveyed by tacheometric means used to test the optimized neural networks. The evaluations showed that segmentation using only multispectral imagery performed slightly better than with images generated from lidar data. In terms of F1 score, Mask R–CNN with color infrared (CIR) images achieved 92% in coniferous, 85% in deciduous and 83% in mixed stands. Compared to the images generated by lidar data, these scores are the same for coniferous and slightly worse for deciduous and mixed plots, by 4% and 2%, respectively. DETR with CIR images achieved 90% in coniferous, 81% in deciduous and 84% in mixed stands. These scores were 2%, 1%, and 2% worse, respectively, compared to the lidar data images in the same test areas. Interestingly, four conventional segmentation methods performed significantly worse than CIR-based and lidar-based instance segmentations. Additionally, the results revealed that tree crowns were more accurately segmented by instance segmentation. 
All in all, the results highlight the practical potential of the two deep learning-based tree segmentation methods, especially in comparison to baseline methods.</p></div>","PeriodicalId":100730,"journal":{"name":"ISPRS Open Journal of Photogrammetry and Remote Sensing","volume":"8 ","pages":"Article 100037"},"PeriodicalIF":0.0000,"publicationDate":"2023-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ISPRS Open Journal of Photogrammetry and Remote Sensing","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S266739322300008X","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

Precise single tree delineation allows for a more reliable determination of essential parameters such as tree species, height and vitality. Instance segmentation methods are powerful neural networks for detecting and segmenting single objects and have the potential to push the accuracy of tree segmentation methods to a new level. In this study, two instance segmentation methods, Mask R–CNN and DETR, were applied to precisely delineate single tree crowns using multispectral images and images generated from UAV lidar data. The study area was in Bavaria, 35 km north of Munich (Germany), comprising a mixed forest stand of around 7 ha characterised mainly by Norway spruce (Picea abies) and large groups of European beeches (Fagus sylvatica), with 181–236 trees per ha. The data set, consisting of multispectral images and lidar data, was acquired using a Micasense RedEdge-MX dual camera system and a Riegl miniVUX-1UAV lidar scanner, both mounted on a hexacopter (DJI Matrice 600 Pro). Two flight missions were conducted at an altitude of approximately 85 m and an airspeed of 5 m/s, yielding a ground resolution of 5 cm and a lidar point density of 560 points/m². In total, 1408 trees were marked by visual interpretation of the remote sensing data for training and validating the classifiers. Additionally, 125 trees were surveyed by tacheometric means and used to test the optimized neural networks. The evaluations showed that segmentation using only multispectral imagery performed slightly better than with images generated from lidar data. In terms of F1 score, Mask R–CNN with color infrared (CIR) images achieved 92% in coniferous, 85% in deciduous and 83% in mixed stands. Compared to the images generated from lidar data, these scores are the same for coniferous plots and slightly worse for deciduous and mixed plots, by 4% and 2%, respectively. DETR with CIR images achieved 90% in coniferous, 81% in deciduous and 84% in mixed stands; these scores were 2%, 1% and 2% worse, respectively, than with the lidar-derived images in the same test areas. Interestingly, four conventional segmentation methods performed significantly worse than the CIR-based and lidar-based instance segmentations. Additionally, the results revealed that tree crowns were delineated more accurately by instance segmentation. All in all, the results highlight the practical potential of the two deep learning-based tree segmentation methods, especially in comparison to the baseline methods.
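
The abstract names Mask R–CNN as one of the two instance segmentation networks but gives no implementation details. As an illustration only, the sketch below shows how a two-class (background + tree crown) Mask R–CNN could be set up and run on a single CIR tile with torchvision; the checkpoint path, class set and score threshold are assumptions, not the authors' configuration.

```python
# Minimal sketch (not the authors' code): two-class Mask R-CNN inference on one CIR tile.
import torch
from PIL import Image
from torchvision.models.detection import maskrcnn_resnet50_fpn
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor
from torchvision.models.detection.mask_rcnn import MaskRCNNPredictor
from torchvision.transforms.functional import to_tensor

NUM_CLASSES = 2  # background + tree crown (assumed label set)

def build_model(checkpoint_path=None):
    # Standard torchvision fine-tuning recipe: COCO-pretrained backbone,
    # heads replaced for the two-class problem.
    model = maskrcnn_resnet50_fpn(weights="DEFAULT")
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, NUM_CLASSES)
    in_channels = model.roi_heads.mask_predictor.conv5_mask.in_channels
    model.roi_heads.mask_predictor = MaskRCNNPredictor(in_channels, 256, NUM_CLASSES)
    if checkpoint_path:  # hypothetical fine-tuned weights, not provided by the paper
        model.load_state_dict(torch.load(checkpoint_path, map_location="cpu"))
    model.eval()
    return model

def segment_tile(model, tile_path, score_threshold=0.5):
    # Returns per-crown boolean masks, boxes and scores for one image tile.
    image = to_tensor(Image.open(tile_path).convert("RGB"))
    with torch.no_grad():
        output = model([image])[0]
    keep = output["scores"] >= score_threshold
    crown_masks = output["masks"][keep][:, 0] > 0.5  # (N, H, W) boolean masks
    return crown_masks, output["boxes"][keep], output["scores"][keep]
```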
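How the "images generated from UAV lidar data" were produced is likewise not specified in the abstract. A common option is to rasterize height-normalized lidar returns into a canopy height image at the quoted 5 cm ground resolution; the NumPy sketch below illustrates that idea only and should not be read as the paper's preprocessing pipeline.

```python
# Plausible canopy-height rasterization of a height-normalized point cloud (assumption).
import numpy as np

def rasterize_chm(points_xyz, cell_size=0.05):
    """points_xyz: (N, 3) array of x, y and height-above-ground values in metres.
    cell_size = 0.05 m matches the 5 cm ground resolution quoted in the abstract."""
    x, y, z = points_xyz[:, 0], points_xyz[:, 1], points_xyz[:, 2]
    cols = ((x - x.min()) / cell_size).astype(int)
    rows = ((y.max() - y) / cell_size).astype(int)  # image rows grow downwards
    chm = np.zeros((rows.max() + 1, cols.max() + 1), dtype=np.float32)
    # Keep the highest return per cell (simple maximum rasterization); empty cells stay 0.
    np.maximum.at(chm, (rows, cols), z)
    return chm
```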
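The reported F1 scores combine precision and recall as F1 = 2 · precision · recall / (precision + recall), computed from matched and unmatched trees per test plot. The matching rule in the sketch below (greedy one-to-one pairing by mask IoU ≥ 0.5) is an assumption for illustration; the abstract does not state the criterion used to pair detected crowns with reference trees.

```python
# Sketch of the detection scoring implied by the reported F1 values (matching rule assumed).
import numpy as np

def f1_score(tp, fp, fn):
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def match_crowns(pred_masks, ref_masks, iou_threshold=0.5):
    # Greedy one-to-one matching of predicted to reference crown masks by IoU.
    matched_refs = set()
    tp = 0
    for p in pred_masks:
        best_iou, best_j = 0.0, None
        for j, r in enumerate(ref_masks):
            if j in matched_refs:
                continue
            union = np.logical_or(p, r).sum()
            iou = np.logical_and(p, r).sum() / union if union else 0.0
            if iou > best_iou:
                best_iou, best_j = iou, j
        if best_j is not None and best_iou >= iou_threshold:
            matched_refs.add(best_j)
            tp += 1
    fp = len(pred_masks) - tp   # detections with no matching reference tree
    fn = len(ref_masks) - tp    # reference trees missed by the segmentation
    return tp, fp, fn
```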
