Waterway-BEV: Generate Bird’s Eye View Layouts of a Waterway From a First-Person View Camera Using Cross-View Transformers

IF 8.4 · Region 1 (Engineering & Technology) · Q1 ENGINEERING, CIVIL

IEEE Transactions on Intelligent Transportation Systems · Pub Date: 2025-04-09 · DOI: 10.1109/TITS.2025.3554717

Feng Ma; Xin Jiang; Chen Chen; Jie Sun; Xin-Ping Yan; Jin Wang

Volume 26, Issue 6, pages 8078-8096. Full text: https://ieeexplore.ieee.org/document/10960549/

Citations: 0

Abstract

In autonomous ship navigation, constructing bird’s-eye view (BEV) layouts of waterways is of clear practical importance. A human helmsman can infer the BEV layout of a waterway by sight alone. To emulate this ability, a novel neural network-based algorithm named Waterway-BEV is proposed, which reconstructs a local map of the waterway layout and ship occupancies in the bird’s-eye view from a single first-person view (FPV) monocular image. Waterway-BEV employs an efficient SEResNeXt encoder to extract features from the FPV image, capturing deep semantic information related to waterways and ships. Because the information available differs between perspectives, Waterway-BEV incorporates a Cross-View Transformation Module, which takes the cycle-consistency constraint between views into account and exploits their correlation to strengthen view transformation and scene understanding. To fully leverage the feature output of the SEResNeXt encoder, Waterway-BEV uses a decoder based on a dedicated lightweight network, which decodes the enhanced BEV feature maps and generates the BEV layout. By using Focal Loss as the loss function for model optimization, Waterway-BEV accounts for both the quantity and the classification difficulty of ship samples during training, thereby improving generation performance and convergence speed. In experiments on waterway BEV layout generation, Waterway-BEV reached an mIOU of 97.8% and an mAP of 98.2%, outperforming other state-of-the-art (SOTA) algorithms. In particular, in specialized scenarios such as waterway crossings and tasks involving small target ships, Waterway-BEV consistently generated satisfactory BEV layouts, demonstrating robustness and applicability.
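The abstract names Focal Loss as the training objective and reports mIOU as an evaluation metric, without giving implementation details. As a rough illustration only, the sketch below shows the standard binary focal loss and a plain mean intersection-over-union computation in pure Python. The function names, the flattened-list input format, and the defaults `gamma=2.0` and `alpha=0.25` are assumptions made for this example; they are not taken from the paper.

```python
import math


def binary_focal_loss(probs, labels, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over a flattened grid of BEV cells.

    probs  : predicted probability that each cell is occupied (0..1)
    labels : ground-truth occupancy per cell (0 or 1)
    gamma, alpha : illustrative defaults, NOT values from the paper
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp so log() stays finite
        p_t = p if y == 1 else 1.0 - p  # probability given to the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha  # class-balancing weight
        # (1 - p_t) ** gamma shrinks the loss of well-classified cells
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both label maps
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0
```

With `gamma=0` and `alpha=0.5` the first function reduces to half the ordinary cross-entropy. Raising `gamma` down-weights cells the model already classifies confidently, which is the general mechanism by which focal loss shifts attention to scarce or hard samples such as small ships.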




Source journal

IEEE Transactions on Intelligent Transportation Systems (Engineering & Technology - Engineering: Electrical & Electronic)

CiteScore

14.80

Self-citation rate

12.90%

Articles published

1872

Review time

7.5 months

About the journal: The journal covers the theoretical, experimental, and operational aspects of electrical and electronics engineering and information technologies as applied to Intelligent Transportation Systems (ITS). Intelligent Transportation Systems are defined as those systems utilizing synergistic technologies and systems engineering concepts to develop and improve transportation systems of all kinds. The scope of this interdisciplinary activity includes the promotion, consolidation, and coordination of ITS technical activities among IEEE entities, and providing a focus for cooperative activities, both internally and externally.