An efficient deep learning strategy for real-time semantic segmentation of trees for embedded systems

Impact factor 8.0; Zone 2, Computer Science; JCR Q1, AUTOMATION & CONTROL SYSTEMS

Engineering Applications of Artificial Intelligence. Pub Date: 2025-07-02. DOI: 10.1016/j.engappai.2025.111516

Pierre Leroy, Emmanuelle Abisset-Chavanne, Régis Pommier, Marco Montemurro

Volume 159, Article 111516. Full text: https://www.sciencedirect.com/science/article/pii/S0952197625015180

Citations: 0

Abstract


Real-time segmentation plays a critical role in semantic simultaneous localization and mapping (SLAM) and autonomous navigation, where inference speed is often prioritized over pixel-level accuracy. Existing segmentation models, such as “You Only Look Once” version 8 (YOLOv8) and the deep-learning-based tree detection and diameter estimation algorithm known as “Perceptree”, are designed for generic use cases, which leads to unnecessary computational overhead in structured environments such as managed pine forests. In this paper, we propose a lightweight, optimized method for real-time tree segmentation using red, green, blue, and depth (RGB-D) data. Our contribution is threefold. The first contribution is a depth-guided region proposal: we extract candidate regions from the depth map using mathematical filtering techniques, which reduces the search space of the supervised model. The second is an embedded-friendly backbone: we simplify the YOLOv8 backbone while integrating depth information, which improves inference speed without compromising the features that matter for similarly shaped and sized objects. The third is a compact segmentation head: instead of pixel-wise classification, we estimate polynomial coefficients that represent object contours, which drastically reduces the number of parameters and accelerates inference. Our model achieves 53 frames per second on an RTX 2060 Super graphics processing unit (GPU), which is 2.7 times faster than YOLOv8 and 10.8 times faster than Perceptree, while achieving a mean average precision of 78.13% on real forest data.
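The abstract does not give implementation details, so the following is only a minimal NumPy sketch of two of the ideas it names, not the authors' method. It assumes that a depth-guided proposal can be approximated by thresholding the depth map and grouping image columns, and that a trunk contour can be described by two boundary polynomials x = p(y), one for each side. The function names (`propose_columns`, `fit_contour`, `decode_contour`), the thresholds, and the quadratic degree are all hypothetical choices made for illustration.

```python
import numpy as np


def propose_columns(depth, max_range=10.0, min_frac=0.5, min_width=3):
    """Return half-open (start, end) column spans likely to contain a trunk.

    Assumption: a column is a candidate when at least `min_frac` of its
    pixels hold a valid depth reading closer than `max_range` metres.
    """
    near = (depth > 0) & (depth < max_range)       # valid, nearby pixels
    active = near.mean(axis=0) >= min_frac         # per-column vote
    # Pad with False so every run has both a rising and a falling edge.
    padded = np.concatenate(([False], active, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = edges[0::2], edges[1::2]
    return [(int(s), int(e)) for s, e in zip(starts, ends) if e - s >= min_width]


def fit_contour(mask, degree=2):
    """Fit left and right boundary polynomials x = p(y) to a binary mask."""
    rows = np.flatnonzero(mask.any(axis=1))        # rows that contain the object
    sub = mask[rows]
    left = sub.argmax(axis=1)                      # first True in each row
    right = mask.shape[1] - 1 - sub[:, ::-1].argmax(axis=1)  # last True
    y = rows / (mask.shape[0] - 1)                 # normalise rows to [0, 1]
    return np.polyfit(y, left, degree), np.polyfit(y, right, degree)


def decode_contour(left_coef, right_coef, shape):
    """Rasterise the region between the two boundary polynomials."""
    h, w = shape
    y = np.arange(h) / (h - 1)
    xs = np.arange(w)[None, :]
    left = np.round(np.polyval(left_coef, y))[:, None]
    right = np.round(np.polyval(right_coef, y))[:, None]
    return (xs >= left) & (xs <= right)


# Synthetic scene: far background with one near, vertical "trunk".
depth = np.full((48, 64), 20.0)
depth[:, 20:30] = 3.0
print("proposals:", propose_columns(depth))

mask = depth < 10.0
left_coef, right_coef = fit_contour(mask)
rebuilt = decode_contour(left_coef, right_coef, mask.shape)
iou = (rebuilt & mask).sum() / (rebuilt | mask).sum()
print("coefficients:", left_coef.size + right_coef.size, "IoU:", iou)
```

Under these assumptions the object is described by six coefficients rather than one value per pixel, which illustrates why a contour-based output can be far more compact than a dense mask. How the paper's segmentation head actually parameterizes and regresses the contours is not stated in the abstract.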


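Source journal

Engineering Applications of Artificial Intelligence (Engineering and Technology: Engineering, Electrical and Electronic)

CiteScore: 9.60

Self-citation rate: 10.00%

Number of publications: 505

Review time: 68 days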

Journal description: Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.