{"title":"Knowledge Guided Visual Transformers for Intelligent Transportation Systems","authors":"Asma Belhadi;Youcef Djenouri;Ahmed Nabil Belbachir;Tomasz Michalak;Gautam Srivastava","doi":"10.1109/TITS.2024.3520487","DOIUrl":"https://doi.org/10.1109/TITS.2024.3520487","url":null,"abstract":"We present a novel approach for addressing computer vision tasks in intelligent transportation systems, with a strong focus on data security during training through federated learning. Our method leverages visual transformers, training multiple models for each image. By calculating and storing visual image features as well as loss values, we propose a novel Shapley value model based on model performance consistency to select the most appropriate models during testing. To enhance security, we introduce an intelligent federated learning strategy, where users are grouped into clusters based on contrastive clustering for creating a global model as well as customized local models. Users receive both global and local models, enabling tailored computer vision applications. We evaluated KGVT-ITS (Knowledge Guided Visual Transformers for Intelligent Transportation Systems) on various ITS challenges, including pedestrian detection, abnormal event detection, and near-crash detection. The results demonstrate the superiority of KGVT-ITS over baseline solutions, showcasing its effectiveness and robustness in intelligent transportation scenarios.
More particularly, KGVT-ITS achieves significant improvements of about 8% against the existing ITS methods.","PeriodicalId":13416,"journal":{"name":"IEEE Transactions on Intelligent Transportation Systems","volume":"26 3","pages":"3341-3349"},"PeriodicalIF":7.9,"publicationDate":"2025-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143535466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
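The record above selects among trained transformer models using a Shapley value over model performance. As a minimal, hypothetical sketch of the underlying game-theoretic step (the model names and coalition payoffs below are invented, not the paper's consistency measure), exact Shapley values can be enumerated for a small model pool:

```python
from itertools import combinations
from math import factorial

def shapley_values(players, v):
    """Exact Shapley values by enumerating all coalitions.

    players: list of model identifiers.
    v: characteristic function mapping a frozenset of players to a payoff
       (here, hypothetically, the validation accuracy of that ensemble).
    """
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for r in range(n):
            for S in combinations(others, r):
                S = frozenset(S)
                # Weight of this coalition in the Shapley average
                weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                total += weight * (v(S | {p}) - v(S))
        phi[p] = total
    return phi

# Toy payoffs: accuracy of each subset of three hypothetical models
acc = {frozenset(): 0.0,
       frozenset({"A"}): 0.70, frozenset({"B"}): 0.60, frozenset({"C"}): 0.50,
       frozenset({"A", "B"}): 0.80, frozenset({"A", "C"}): 0.75,
       frozenset({"B", "C"}): 0.65, frozenset({"A", "B", "C"}): 0.85}
phi = shapley_values(["A", "B", "C"], acc.__getitem__)
# Efficiency property: contributions sum to the grand-coalition payoff
assert abs(sum(phi.values()) - 0.85) < 1e-9
```

At test time, models with the highest attribution would be preferred; the paper's actual payoff is derived from stored features and loss values rather than a lookup table.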
{"title":"Panoramic Sea-Ice Map Construction for Polar Navigation Based on Multi-Perspective Images Projection and Camera Poses Rectification","authors":"Ruizhe Lu;Junpeng Shang;Jun Wu;Yihe Wang;Dongfang Ma","doi":"10.1109/TITS.2024.3523287","DOIUrl":"https://doi.org/10.1109/TITS.2024.3523287","url":null,"abstract":"Panoramic map construction of polar sea ice can provide significant assistance for intelligent navigation and routing planning in polar regions. Traditional methods for panoramic observation and map generation suffer from issues such as limited parallax tolerance, poor stitching robustness, and low mapping accuracy. In this paper, these problems are addressed by a proposed online panoramic method based on multi-perspective image projection and camera pose rectification. The proposed method incorporates a modified inverse projection module that dynamically adapts to the ship’s attitude, thereby stably restoring the sea-ice images into bird’s-eye view (BEV) in real-time. In order to resolve the challenging sea-ice feature alignment under significant parallax, a planar feature registration method is proposed which robustly aligns the features between the projected images. Moreover, a camera pose rectification module is specifically designed for planar projection tasks, which virtually adjusts the camera’s extrinsic parameters for obtaining more precise and high-quality panoramic sea-ice maps. Finally, online construction of global sea-ice field maps is achieved by fusing the local maps during navigation. Extensive qualitative and quantitative experiments demonstrate that the proposed method outperforms other panoramic methods in terms of map accuracy and stitching quality. 
Additionally, the proposed method is more suitable for downstream tasks, including polar simultaneous localization and mapping (SLAM) and path planning.","PeriodicalId":13416,"journal":{"name":"IEEE Transactions on Intelligent Transportation Systems","volume":"26 3","pages":"3417-3430"},"PeriodicalIF":7.9,"publicationDate":"2025-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143535508","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
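The inverse-projection step described in the abstract above can be illustrated with a flat-plane back-projection: cast each pixel's viewing ray and intersect it with the sea plane. This is a generic pinhole-camera sketch, not the paper's attitude-adaptive module; the intrinsics K and pose R, t below are made-up values.

```python
import numpy as np

def pixel_to_bev(u, v, K, R, t, plane_z=0.0):
    """Back-project pixel (u, v) onto the z = plane_z sea plane.

    K: 3x3 camera intrinsics.
    R, t: camera-to-world rotation and camera position in world frame.
    Returns the 3D intersection point of the pixel ray with the plane.
    """
    ray_cam = np.linalg.inv(K) @ np.array([u, v, 1.0])   # ray in camera frame
    ray_world = R @ ray_cam                              # rotate into world frame
    s = (plane_z - t[2]) / ray_world[2]                  # scale to reach the plane
    return t + s * ray_world

# Hypothetical setup: camera 10 m above the plane, looking straight down.
K = np.array([[500.0, 0.0, 320.0], [0.0, 500.0, 240.0], [0.0, 0.0, 1.0]])
R = np.array([[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]])
t = np.array([0.0, 0.0, 10.0])
p_center = pixel_to_bev(320, 240, K, R, t)  # principal point: directly below camera
p_off = pixel_to_bev(820, 240, K, R, t)     # 500 px offset at f=500 -> 10 m on the plane
```

Adjusting R per frame from measured ship attitude is, in spirit, what the paper's rectification module does before stitching the BEV tiles.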
{"title":"Decoupling Objectives for Segmented Path Planning: A Subtask-Oriented Trajectory Planning Approach","authors":"Guangliang Liao;Chunyun Fu;Yinghong Yu;Kexue Lai;Beihao Xia;Jingkang Xia","doi":"10.1109/TITS.2024.3518915","DOIUrl":"https://doi.org/10.1109/TITS.2024.3518915","url":null,"abstract":"Local trajectory planning (TP) for collision avoidance typically comprises path planning (PP) and velocity planning (VP). Various objectives must be fulfilled in a PP task, and the majority of prior works integrate all objectives into a unified cost function. To prioritize the dominant objectives of each PP stage, we propose to decouple the PP task into two separate subtasks, enabling the logical establishment of subtask-oriented segmented PP methods. First, based on risk evaluation of four vehicle vertices, an improved artificial potential field was established. Second, a novel transit point selection method was applied to decouple the PP task into two segmented subtasks. Then, the optimization problem was converted into a multi-attribute decision-making (MADM) problem and the technique for order preference by similarity to ideal solution (TOPSIS) method was utilized to obtain two optimal segmented paths. Finally, a velocity planner based on a cubic polynomial, in conjunction with a support vector machine-based stability classifier, was designed. The proposed trajectory planner was then verified in six typical driving scenarios, in both simulation and real-vehicle studies.
Verification results demonstrate that the proposed planner effectively decouples the PP task and achieves a safe, comfortable, efficient, and trackable trajectory.","PeriodicalId":13416,"journal":{"name":"IEEE Transactions on Intelligent Transportation Systems","volume":"26 3","pages":"3960-3975"},"PeriodicalIF":7.9,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143535582","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
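TOPSIS, named in the abstract above as the MADM solver, is straightforward to sketch: normalize the decision matrix, weight it, and rank candidates by closeness to the ideal solution. The attributes, weights, and candidate values below are hypothetical stand-ins for the paper's path costs.

```python
import numpy as np

def topsis(X, weights, benefit):
    """Rank candidates with TOPSIS.

    X: (m, n) decision matrix, one row per candidate path, one column per attribute.
    weights: attribute weights summing to 1.
    benefit: boolean per attribute, True = larger is better.
    Returns closeness-to-ideal scores in [0, 1]; higher is better.
    """
    norm = X / np.linalg.norm(X, axis=0)          # vector normalization per column
    V = norm * weights                            # weighted normalized matrix
    ideal = np.where(benefit, V.max(0), V.min(0)) # positive ideal solution
    worst = np.where(benefit, V.min(0), V.max(0)) # negative ideal solution
    d_pos = np.linalg.norm(V - ideal, axis=1)
    d_neg = np.linalg.norm(V - worst, axis=1)
    return d_neg / (d_pos + d_neg)

# Hypothetical attributes per candidate path: [clearance (m), comfort, length (m)]
X = np.array([[2.0, 0.8, 120.0],
              [1.0, 0.9, 100.0],
              [3.0, 0.4, 150.0]])
scores = topsis(X, np.array([0.5, 0.3, 0.2]), np.array([True, True, False]))
best = int(np.argmax(scores))
```

With clearance weighted heaviest, the third candidate wins despite being longest; the paper applies the same ranking separately to each of the two path segments.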
{"title":"Sharing Control Knowledge Among Heterogeneous Intersections: A Distributed Arterial Traffic Signal Coordination Method Using Multi-Agent Reinforcement Learning","authors":"Hong Zhu;Jialong Feng;Fengmei Sun;Keshuang Tang;Di Zang;Qi Kang","doi":"10.1109/TITS.2024.3521514","DOIUrl":"https://doi.org/10.1109/TITS.2024.3521514","url":null,"abstract":"Treating each intersection as a basic agent, multi-agent reinforcement learning (MARL) methods have emerged as the predominant approach for distributed adaptive traffic signal control (ATSC) in multi-intersection scenarios, such as arterial coordination. MARL-based ATSC currently faces two challenges: disturbances from the control policies of other intersections may impair the learning and control stability of the agents; and the heterogeneous features across intersections may complicate coordination efforts. To address these challenges, this study proposes a novel MARL method for distributed ATSC in arterials, termed the Distributed Controller for Heterogeneous Intersections (DCHI). The DCHI method introduces a Neighborhood Experience Sharing (NES) framework, wherein each agent utilizes both local data and shared experiences from adjacent intersections to improve its control policy. Within this framework, the neural networks of each agent are partitioned into two parts following the Knowledge Homogenizing Encapsulation (KHE) mechanism. The first part manages heterogeneous intersection features and transforms the control experiences, while the second part optimizes homogeneous control logic. Experimental results demonstrate that the proposed DCHI achieves efficiency improvements in average travel time of over 30% compared to traditional methods and yields similar performance to the centralized sharing method. Furthermore, vehicle trajectories reveal that DCHI can adaptively establish green wave bands in a distributed manner.
Given its superior control performance, accommodation of heterogeneous intersections, and low reliance on information networks, DCHI could significantly advance the application of MARL-based ATSC methods in practice.","PeriodicalId":13416,"journal":{"name":"IEEE Transactions on Intelligent Transportation Systems","volume":"26 2","pages":"2760-2776"},"PeriodicalIF":7.9,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143183767","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
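The Neighborhood Experience Sharing idea above — each agent learns from its own transitions plus a sample of a neighbor's — can be sketched with tabular Q-learning on a deliberately trivial one-state task. The environment, rewards, and buffer sizes are invented for illustration; the paper uses deep networks and the KHE network split, neither of which is reproduced here.

```python
import random

def q_update(Q, s, a, r, s2, alpha=0.1, gamma=0.9):
    """One tabular Q-learning update on transition (s, a, r, s2)."""
    target = r + gamma * max(Q.get((s2, b), 0.0) for b in (0, 1))
    Q[(s, a)] = Q.get((s, a), 0.0) + alpha * (target - Q.get((s, a), 0.0))

random.seed(0)
# Two toy "intersection" agents; action 1 always yields reward 1, action 0 yields 0.
buffers = {0: [], 1: []}
Q = {0: {}, 1: {}}
for step in range(500):
    for agent in (0, 1):
        a = random.randint(0, 1)                 # explore uniformly
        buffers[agent].append((0, a, float(a), 0))  # (s, a, r, s')
for agent in (0, 1):
    neighbor = 1 - agent
    # NES-style: learn from local experience plus a sample of the neighbor's
    for (s, a, r, s2) in buffers[agent] + random.sample(buffers[neighbor], 200):
        q_update(Q[agent], s, a, r, s2)
greedy = {ag: max((0, 1), key=lambda a: Q[ag].get((0, a), 0.0)) for ag in (0, 1)}
```

Both agents converge on the rewarding action; the interesting part in the paper is that sharing still works when the agents' observation spaces differ, which the KHE encoder handles.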
{"title":"Semantically Adversarial Scene Generation With Explicit Knowledge Guidance","authors":"Wenhao Ding;Haohong Lin;Bo Li;Ding Zhao","doi":"10.1109/TITS.2024.3510515","DOIUrl":"https://doi.org/10.1109/TITS.2024.3510515","url":null,"abstract":"Generating adversarial scenes that potentially fail autonomous driving systems provides an effective way to improve their robustness. Extending purely data-driven generative models, recent specialized models satisfy additional controllable requirements such as embedding a traffic sign in a driving scene by manipulating patterns implicitly at the neuron level. In this paper, we introduce a method to incorporate domain knowledge explicitly in the generation process to achieve Semantically Adversarial Generation (SAG). To be consistent with the composition of driving scenes, we first categorize the knowledge into two types, the property of objects and the relationship among objects. We then propose a tree-structured variational auto-encoder (T-VAE) to learn hierarchical scene representation. By imposing semantic rules on the properties of nodes and edges into the tree structure, explicit knowledge integration enables controllable generation. To demonstrate the advantage of structural representation, we construct a synthetic example to illustrate the controllability and explainability of our method in a succinct setting. 
We further extend to realistic environments for autonomous vehicles, showing that our method efficiently identifies adversarial driving scenes against different state-of-the-art 3D point cloud segmentation models and satisfies the constraints specified as explicit knowledge.","PeriodicalId":13416,"journal":{"name":"IEEE Transactions on Intelligent Transportation Systems","volume":"26 2","pages":"1510-1521"},"PeriodicalIF":7.9,"publicationDate":"2025-01-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143183814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Explainable Q-Learning Method for Longitudinal Control of Autonomous Vehicles","authors":"Meng Li;Zhihao Cui;Yulei Wang;Yanjun Huang;Hong Chen","doi":"10.1109/TITS.2024.3521385","DOIUrl":"https://doi.org/10.1109/TITS.2024.3521385","url":null,"abstract":"Various artificial intelligence (AI) algorithms have been developed for autonomous vehicles (AVs) to support environmental perception, decision making and automated driving in real-world scenarios. Existing AI methods, such as deep learning and deep reinforcement learning, have been criticized due to their black box nature. Explainable AI technologies are important for assisting users in understanding vehicle behaviors to ensure that users trust, accept, and rely on AI devices. In this paper, an explainable Q-learning method for AV longitudinal control is proposed. First, AI control of AVs is realized by constructing a deep Q-network (DQN) with an intelligent driver model, with the control objective maximizing vehicle speed while preventing collisions. Then, a deep explainer for humans is developed via a Shapley additive explanation (SHAP), and a novel positive SHAP method that defines new base values is proposed to explain how individual state features contribute to decisions. Finally, statistical analyses and intuitive explanations are quantified based on SHAP tools to improve clarity. Elaborate numerical simulations are conducted to demonstrate the effectiveness of the proposed algorithm. 
The code is available at <uri>https://github.com/limeng-1234/Pos_Shap</uri>.","PeriodicalId":13416,"journal":{"name":"IEEE Transactions on Intelligent Transportation Systems","volume":"26 3","pages":"4214-4218"},"PeriodicalIF":7.9,"publicationDate":"2025-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143535537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Evidence-Based Real-Time Road Segmentation With RGB-D Data Augmentation","authors":"Feng Xue;Yicong Chang;Wenzhuang Xu;Wenteng Liang;Fei Sheng;Anlong Ming","doi":"10.1109/TITS.2024.3509140","DOIUrl":"https://doi.org/10.1109/TITS.2024.3509140","url":null,"abstract":"Despite significant progress in RGB-D based road segmentation in recent years, the latest methods cannot achieve both state-of-the-art accuracy and real-time speed, as their high performance relies on heavy structures. We argue that this reliance is due to unsuitable multimodal fusion. To be specific, RGB and depth data in road scenes are each sensitive to different regions, but current RGB-D based road segmentation methods generally combine features within sensitive regions which preserves false road representation from one of the data. Based on such findings, we design an Evidence-based Road Segmentation Method (Evi-RoadSeg), which incorporates prior knowledge of the modal-specific characteristics. Firstly, we abandon the cross-modal fusion operation commonly used in existing multimodal based methods. Instead, we collect the road evidence from RGB and depth inputs separately via two low-latency subnetworks, and fuse the road representation of the two subnetworks by taking both modalities’ evidence as a measure of confidence. Secondly, we propose an RGB-D data augmentation scheme tailored to road scenes to enhance the unique properties of RGB and depth data. It facilitates learning by adding more sensitive regions to the samples. Finally, the proposed method is evaluated on the widely used KITTI-road, ORFD, and R2D datasets. Our method achieves state-of-the-art accuracy at over 70 FPS, <inline-formula> <tex-math>$5\times $ </tex-math></inline-formula> faster than comparable RGB-D methods. Furthermore, extensive experiments illustrate that our method can be deployed on a Jetson Nano 2GB with a speed of 8+ FPS.
The code will be released in <uri>https://github.com/xuefeng-cvr/Evi-RoadSeg</uri>.","PeriodicalId":13416,"journal":{"name":"IEEE Transactions on Intelligent Transportation Systems","volume":"26 2","pages":"1482-1493"},"PeriodicalIF":7.9,"publicationDate":"2025-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143183810","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integrating GPU-Accelerated for Fast Large-Scale Vessel Trajectories Visualization in Maritime IoT Systems","authors":"Maohan Liang;Kezhong Liu;Ruobin Gao;Yan Li","doi":"10.1109/TITS.2024.3521050","DOIUrl":"https://doi.org/10.1109/TITS.2024.3521050","url":null,"abstract":"With the advancement of satellite communication technology, the maritime Internet of Things (IoT) has made significant progress. As a result, vast amounts of Automatic Identification System (AIS) data from global vessels are transmitted to various maritime stakeholders through Maritime IoT systems. AIS data contains a large amount of dynamic and static information that requires effective and intuitive visualization for comprehensive analysis. However, two major deficiencies challenge current visualization models: a lack of consideration for interactions between distant pixels and low efficiency. To address these issues, we developed a large-scale vessel trajectories visualization algorithm, called the Non-local Kernel Density Estimation (NLKDE) algorithm, which incorporates a non-local convolution process. It accurately calculates the density distribution of vessel trajectories by considering correlations between distant pixels. Additionally, we implemented the NLKDE algorithm under a Graphics Processing Unit (GPU) framework to enable parallel computing and improve operational efficiency. Comprehensive experiments using multiple vessel trajectory datasets show that the NLKDE algorithm excels in vessel trajectory density visualization tasks, and the GPU-accelerated framework significantly shortens the execution time to achieve real-time results. From both theoretical and practical perspectives, GPU-accelerated NLKDE provides technical support for real-time monitoring of vessel dynamics in complex water areas and contributes to constructing maritime intelligent transportation systems. 
The code for this paper can be accessed at: <uri>https://github.com/maohliang/GPU-NLKDE</uri>.","PeriodicalId":13416,"journal":{"name":"IEEE Transactions on Intelligent Transportation Systems","volume":"26 3","pages":"4048-4065"},"PeriodicalIF":7.9,"publicationDate":"2025-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143535518","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
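Kernel density estimation over rasterized trajectory points — the classical baseline that NLKDE extends — can be sketched as an FFT convolution of a 2D histogram with a Gaussian kernel. Grid size, bandwidth, and the synthetic hotspot below are arbitrary; swapping NumPy for CuPy would give a GPU variant, though the paper's non-local kernel is not reproduced here.

```python
import numpy as np

def kde_density(points, grid=64, bandwidth=2.0):
    """Rasterize points into a grid and smooth with a Gaussian kernel via FFT."""
    H, _, _ = np.histogram2d(points[:, 0], points[:, 1],
                             bins=grid, range=[[0, grid], [0, grid]])
    y, x = np.mgrid[0:grid, 0:grid]
    c = grid // 2
    kernel = np.exp(-((x - c) ** 2 + (y - c) ** 2) / (2 * bandwidth ** 2))
    kernel /= kernel.sum()                        # normalize so total mass is kept
    # Circular FFT convolution; ifftshift moves the kernel center to the origin.
    D = np.real(np.fft.ifft2(np.fft.fft2(H) * np.fft.fft2(np.fft.ifftshift(kernel))))
    return D

rng = np.random.default_rng(0)
pts = rng.normal(32.0, 3.0, size=(1000, 2))       # a synthetic traffic hotspot
D = kde_density(pts)
```

The FFT form is what makes the dense, all-pairs style of smoothing tractable at scale, and it parallelizes naturally, which is the efficiency argument the abstract makes for the GPU framework.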
{"title":"Which Cycling Environment Appears Safer? Learning Cycling Safety Perceptions From Pairwise Image Comparisons","authors":"Miguel Costa;Manuel Marques;Carlos Lima Azevedo;Felix Wilhelm Siebert;Filipe Moura","doi":"10.1109/TITS.2024.3507639","DOIUrl":"https://doi.org/10.1109/TITS.2024.3507639","url":null,"abstract":"Cycling is critical for cities to transition to more sustainable transport modes. Yet, safety concerns remain a critical deterrent for individuals to cycle. If individuals perceive an environment as unsafe for cycling, it is likely that they will prefer other means of transportation. However, capturing and understanding how individuals perceive cycling risk is complex and often slow, with researchers defaulting to traditional surveys and in-loco interviews. In this study, we tackle this problem. We base our approach on using pairwise comparisons of real-world images, repeatedly presenting respondents with pairs of road environments and asking them to select the one they perceive as safer for cycling, if any. Using the collected data, we train a siamese-convolutional neural network using a multi-loss framework that learns from individuals’ responses, learns preferences directly from images, and includes ties (often discarded in the literature). Effectively, this model learns to predict human-style perceptions, evaluating which cycling environments are perceived as safer. Our model achieves good results, showcasing that this approach has real-life impact, such as improving interventions’ effectiveness. Furthermore, it facilitates the continuous assessment of changing cycling environments, permitting short-term evaluations of measures to enhance perceived cycling safety.
Finally, our method can be efficiently deployed in different locations with a growing number of openly available street-view images.","PeriodicalId":13416,"journal":{"name":"IEEE Transactions on Intelligent Transportation Systems","volume":"26 2","pages":"1689-1700"},"PeriodicalIF":7.9,"publicationDate":"2025-01-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143183860","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
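The pairwise-comparison signal described above is classically fitted with a Bradley-Terry model. This sketches only the preference-learning objective on item indices: the comparison counts below are invented, and the paper instead learns scores end-to-end from images with a siamese CNN and also models ties, which are omitted here.

```python
import math

def bradley_terry(n_items, comparisons, iters=2000, lr=0.05):
    """Fit latent 'perceived safety' scores by maximum likelihood.

    comparisons: list of (winner, loser) index pairs, e.g. answers to
    'which scene looks safer?'. Gradient ascent on the log-likelihood.
    """
    s = [0.0] * n_items
    for _ in range(iters):
        grad = [0.0] * n_items
        for w, l in comparisons:
            p = 1.0 / (1.0 + math.exp(s[l] - s[w]))  # P(w preferred over l)
            grad[w] += 1.0 - p
            grad[l] -= 1.0 - p
        s = [si + lr * g for si, g in zip(s, grad)]
        m = sum(s) / n_items                         # scores are translation-
        s = [si - m for si in s]                     # invariant, so re-center
    return s

# Scene 0 is judged safer than 1 in 8 of 10 trials, and 1 safer than 2 in 9 of 10.
data = [(0, 1)] * 8 + [(1, 0)] * 2 + [(1, 2)] * 9 + [(2, 1)] * 1
scores = bradley_terry(3, data)
```

The recovered ordering matches the majority judgments; replacing the score table with a CNN over street-view images gives the paper's setup in outline.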
{"title":"MIM: High-Definition Maps Incorporated Multi-View 3D Object Detection","authors":"Jinsheng Xiao;Shurui Wang;Jian Zhou;Ziyue Tian;Hongping Zhang;Yuan-Fang Wang","doi":"10.1109/TITS.2024.3520814","DOIUrl":"https://doi.org/10.1109/TITS.2024.3520814","url":null,"abstract":"3D object detection has attracted increasing interest as a crucial component of autonomous driving systems. While recent works have explored various multi-modal fusion methods to enhance accuracy and robustness, fusing multi-view images and high-definition (HD) maps remains uncharted. Inspired by our previous work, we endeavor to introduce HD maps to camera-based detection, prompting the design of a new framework. To address this, we first analyze the function of HD maps in object detection to understand their benefits and the rationale for their fusion. From this analysis, we identify key disparities in view, semantics, and scale, leading to the development of MIM, a framework for HD Maps Incorporated Multi-view 3D object detection. HD maps are enriched in semantics by sampling unlabeled areas and encoding them into map features. Simultaneously, multi-view images are transformed into features in bird’s-eye view (BEV) using the adopted baseline. These features are then fused using attention mechanisms to align scales. Experiments conducted on the nuScenes dataset demonstrate that MIM outperforms camera-based methods. Moreover, an in-depth analysis investigates how HD maps impact object detection regarding each semantic layer. The results underscore the operational intricacies of HD maps in perception, setting the stage for future research.
Code is available at <uri>https://github.com/WHU-xjs/MIM-3D-Det</uri>.","PeriodicalId":13416,"journal":{"name":"IEEE Transactions on Intelligent Transportation Systems","volume":"26 3","pages":"3989-4001"},"PeriodicalIF":7.9,"publicationDate":"2024-12-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143535465","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
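The attention-based fusion mentioned in the abstract above reduces, in its generic form, to scaled dot-product cross-attention between BEV queries and HD-map tokens. The shapes and random features below are placeholders, not MIM's actual dimensions or encoder outputs.

```python
import numpy as np

def cross_attention(Q, K, V):
    """Single-head scaled dot-product attention: Q attends over K/V tokens."""
    d = Q.shape[-1]
    logits = Q @ K.T / np.sqrt(d)
    w = np.exp(logits - logits.max(axis=-1, keepdims=True))  # stable softmax
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V                       # each output row: convex mix of V rows

rng = np.random.default_rng(0)
bev = rng.normal(size=(100, 32))       # hypothetical flattened BEV query features
hd_map = rng.normal(size=(50, 32))     # hypothetical HD-map feature tokens
fused = cross_attention(bev, hd_map, hd_map)
```

Because attention mixes tokens regardless of their spatial resolution, it sidesteps the scale mismatch between rasterized map layers and BEV features, which is the alignment role the abstract assigns to it.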