Pierre Leroy, Emmanuelle Abisset-Chavanne, Régis Pommier, Marco Montemurro
An efficient deep learning strategy for real-time semantic segmentation of trees for embedded systems
Engineering Applications of Artificial Intelligence, volume 159, Article 111516. Published 2025-07-02.
DOI: 10.1016/j.engappai.2025.111516
URL: https://www.sciencedirect.com/science/article/pii/S0952197625015180
Abstract
Real-time segmentation plays a critical role in semantic simultaneous localization and mapping (SLAM) and autonomous navigation, where inference speed is often prioritized over pixel-level accuracy. Existing segmentation models, such as “You Only Look Once” version 8 (YOLOv8) or the deep-learning-based tree detection and diameter estimation algorithm known as “Perceptree”, are designed for generic use cases, which leads to unnecessary computational overhead in structured environments such as managed pine forests. In this paper, we propose a lightweight, optimized method for real-time tree segmentation using red, green, blue, and depth (RGB-D) data. Our contribution is threefold. The first contribution is a depth-guided region proposal: we extract candidate regions from the depth map using mathematical filtering techniques, thus reducing the search space of the supervised model. The second is an embedded-friendly backbone: we simplify the YOLOv8 backbone while integrating depth information, improving inference speed without compromising key features for similarly shaped and sized objects. The third is a compact segmentation head: instead of performing pixel-wise classification, we estimate polynomial coefficients that represent object contours, drastically reducing the number of parameters and accelerating inference. Our model achieves 53 frames per second on an NVIDIA GeForce RTX 2060 Super graphics processing unit (GPU), which is 2.7 times faster than YOLOv8 and 10.8 times faster than Perceptree, while achieving a mean average precision of 78.13% on real forest data.
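The abstract does not specify which depth filters are used or how the contour polynomials are parameterized, so the following is only a minimal sketch of the two ideas under stated assumptions, not the authors' implementation. It assumes that candidate regions can be found by thresholding depth per image column, and that a trunk contour can be described by two polynomials giving the left and right boundary as a function of image row. The function names (`propose_columns`, `fit_contour`, `rasterize`), the thresholds, and the synthetic scene are all hypothetical. In the paper a network predicts the coefficients; here they are fitted to a known mask purely to illustrate the representation.

```python
import numpy as np

def propose_columns(depth, max_range=5.0, min_fraction=0.5, min_width=3):
    """Return (start, end) column intervals (end exclusive) holding near objects.

    Assumption: a column is 'occupied' when at least `min_fraction` of its
    pixels are valid (> 0) and closer than `max_range`.
    """
    near = (depth > 0) & (depth < max_range)
    occupied = near.mean(axis=0) >= min_fraction
    # Pad with False so every run of occupied columns has both edges.
    padded = np.concatenate(([False], occupied, [False])).astype(np.int8)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    return [(int(s), int(e)) for s, e in zip(starts, ends) if e - s >= min_width]

def fit_contour(mask, x0, x1, degree=2):
    """Fit left/right boundary polynomials x = p(y) within columns [x0, x1)."""
    region = mask[:, x0:x1]
    rows = np.flatnonzero(region.any(axis=1))
    left = np.array([x0 + np.argmax(region[r]) for r in rows])
    right = np.array(
        [x0 + region.shape[1] - 1 - np.argmax(region[r][::-1]) for r in rows]
    )
    y = rows / (mask.shape[0] - 1)  # normalise rows to [0, 1]
    return np.polyfit(y, left, degree), np.polyfit(y, right, degree)

def rasterize(coeffs_left, coeffs_right, shape):
    """Rebuild a binary mask from the two boundary polynomials."""
    h, w = shape
    y = np.arange(h) / (h - 1)
    left = np.round(np.polyval(coeffs_left, y))[:, None]
    right = np.round(np.polyval(coeffs_right, y))[:, None]
    x = np.arange(w)[None, :]
    return (x >= left) & (x <= right)

# Synthetic scene (hypothetical): background at 10 m, one trunk at 2 m.
depth = np.full((48, 64), 10.0)
depth[:, 20:30] = 2.0
true_mask = depth < 5.0

proposals = propose_columns(depth)
x0, x1 = proposals[0]
coeffs_left, coeffs_right = fit_contour(true_mask, x0, x1)
rebuilt = rasterize(coeffs_left, coeffs_right, depth.shape)
iou = (rebuilt & true_mask).sum() / (rebuilt | true_mask).sum()
```

With degree-2 polynomials, the trunk in this toy scene is described by six coefficients rather than the 48 × 64 = 3072 values of a per-pixel mask, which is the kind of output-size reduction the compact segmentation head relies on.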
About the journal:
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution and has seen remarkable advances across a range of machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research on the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI used in real-world engineering applications, validated on publicly available datasets to ensure that research outcomes can be replicated. Join us in exploring the transformative potential of AI in engineering.