New generation thermal traffic sensor: A novel dataset and monocular 3D thermal vision framework
Authors: Arnd Pettirsch, Alvaro Garcia-Hernandez
Journal: Knowledge-Based Systems, volume 315, Article 113334
Published: 2025-03-10
DOI: 10.1016/j.knosys.2025.113334
URL: https://www.sciencedirect.com/science/article/pii/S0950705125003818
Impact factor: 7.2; JCR: Q1 (Computer Science, Artificial Intelligence)
Citation count: 0
Abstract
Applications such as traffic safety analysis require highly accurate trajectories of traffic participants in world coordinates. Systems such as LiDAR or stereo cameras can provide this data, but they are costly, sensitive to weather and lighting conditions, and may raise privacy concerns. Thermal roadside cameras offer a robust, privacy-compliant alternative. However, monocular thermal cameras are challenging to use because the relationship between pixel coordinates and world coordinates is ambiguous. Existing methods for monocular 3D detection from RGB roadside cameras often rely on simplifications or on the complex task of depth estimation, which limits their effectiveness. Additionally, no dataset currently exists for monocular 3D detection using thermal roadside imagery. This work introduces a dataset of 9,591 thermal images annotated in 3D world coordinates, including detailed camera calibration and surface models. It also proposes a lightweight neural network architecture that uses a projection-based method to incorporate road surface information. By detecting bottom-center contact points in image space and projecting them into 3D, the presented framework efficiently estimates each object's position, dimensions, and orientation in 3D. The presented approach outperforms homography-based methods by 25 percentage points in mean average precision (mAP). It achieves real-time performance, with 54 FPS on a GPU server and 17 FPS on an NVIDIA Jetson Xavier NX, making it suitable for edge deployment. Unlike RGB-based systems, the method ensures data privacy and remains effective in diverse weather and lighting conditions, enabling reliable trajectory analysis and near-miss detection for traffic safety applications. The dataset is available at https://doi.org/10.17632/tw6ghtv624.1. The code used in this work is available at https://github.com/4rnd25/new_generation_thermal_traffic_sensor.
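The projection step described in the abstract, mapping a detected bottom-center contact point from image space onto the road surface, can be illustrated with a ray-plane intersection. The following is a minimal sketch and not the authors' implementation: it assumes an ideal pinhole camera without lens distortion and a locally flat road plane, whereas the paper works with calibrated surface models. The function name `pixel_to_ground` and all parameter names are hypothetical.

```python
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def _dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def pixel_to_ground(
    u: float, v: float,
    fx: float, fy: float, cx: float, cy: float,
    R: Sequence[Sequence[float]],       # 3x3 rotation, world -> camera
    t: Sequence[float],                 # translation, world -> camera
    plane_n: Sequence[float] = (0.0, 0.0, 1.0),  # assumed road normal
    plane_d: float = 0.0,               # plane equation: n . X = d
) -> Vec3:
    """Intersect the viewing ray of pixel (u, v) with a flat road plane.

    Assumption (not from the paper): the road is the plane n . X = d
    and the camera follows X_cam = R @ X_world + t.
    """
    # Viewing ray direction in the camera frame (pinhole model, no skew).
    d_cam = ((u - cx) / fx, (v - cy) / fy, 1.0)
    # R maps world -> camera, so its transpose maps camera -> world.
    d_world = tuple(sum(R[r][c] * d_cam[r] for r in range(3)) for c in range(3))
    # Camera center in world coordinates: C = -R^T t.
    center = tuple(-sum(R[r][c] * t[r] for r in range(3)) for c in range(3))

    denom = _dot(plane_n, d_world)
    if abs(denom) < 1e-12:
        raise ValueError("viewing ray is parallel to the road plane")
    s = (plane_d - _dot(plane_n, center)) / denom
    if s <= 0:
        raise ValueError("road plane lies behind the camera")
    return tuple(center[i] + s * d_world[i] for i in range(3))


if __name__ == "__main__":
    # Hypothetical setup: camera 5 m above the road, looking straight down.
    R_down = ((1.0, 0.0, 0.0), (0.0, -1.0, 0.0), (0.0, 0.0, -1.0))
    t_down = (0.0, 0.0, 5.0)
    print(pixel_to_ground(520.0, 440.0, 1000.0, 1000.0, 320.0, 240.0,
                          R_down, t_down))
```

With this hypothetical downward-looking camera, the pixel (520, 440) maps to a point approximately 1 m along the world x-axis and -1 m along the world y-axis on the road. A homography-based baseline bakes the same flat-plane assumption into a single 3x3 matrix, so a surface model that captures road slope or curvature is one plausible source of the accuracy gap the abstract reports.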
About the journal:
Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on systems built on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computation techniques, to provide balanced coverage of theory and practical study, and to encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.