{"title":"Multi-sensor data fusion across dimensions: A novel approach to synopsis generation using sensory data","authors":"Palash Yuvraj Ingle, Young-Gab Kim","doi":"10.1016/j.jii.2025.100876","DOIUrl":null,"url":null,"abstract":"<div><div>Unmanned aerial vehicles (UAVs) and autonomous ground vehicles are increasingly outfitted with advanced sensors such as LiDAR, cameras, and GPS, enabling real-time object detection, tracking, localization, and navigation. These platforms generate high-volume sensory data, such as video streams and point clouds, that require efficient processing to support timely and informed decision-making. Although video synopsis techniques are widely used for visual data summarization, they encounter significant challenges in multi-sensor environments due to disparities in sensor modalities. To address these limitations, we propose a novel sensory data synopsis framework designed for both UAV and autonomous vehicle applications. The proposed system integrates a dual-task learning model with a real-time sensor fusion module to jointly perform abnormal object segmentation and depth estimation by combining LiDAR and camera data. The framework comprises a sensory fusion algorithm, a 3D-to-2D projection mechanism, and a Metropolis-Hastings-based trajectory optimization strategy to refine object tubes and construct concise, temporally-shifted synopses. This design selectively preserves and repositions salient information across space and time, enhancing synopsis clarity while reducing computational overhead. Experimental evaluations conducted on standard datasets (i.e., KITTI, Cityscapes, and DVS) demonstrate that our framework achieves a favorable balance between segmentation accuracy and inference speed. In comparison with existing studies, it yields superior performance in terms of frame reduction, recall, and F1 score. The results highlight the robustness, real-time capability, and broad applicability of the proposed approach to intelligent surveillance, smart infrastructure, and autonomous mobility systems.</div></div>","PeriodicalId":55975,"journal":{"name":"Journal of Industrial Information Integration","volume":"46 ","pages":"Article 100876"},"PeriodicalIF":10.4000,"publicationDate":"2025-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Industrial Information Integration","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2452414X25000998","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
Abstract
Unmanned aerial vehicles (UAVs) and autonomous ground vehicles are increasingly outfitted with advanced sensors such as LiDAR, cameras, and GPS, enabling real-time object detection, tracking, localization, and navigation. These platforms generate high-volume sensory data, such as video streams and point clouds, that require efficient processing to support timely and informed decision-making. Although video synopsis techniques are widely used for visual data summarization, they encounter significant challenges in multi-sensor environments due to disparities in sensor modalities. To address these limitations, we propose a novel sensory data synopsis framework designed for both UAV and autonomous vehicle applications. The proposed system integrates a dual-task learning model with a real-time sensor fusion module to jointly perform abnormal object segmentation and depth estimation by combining LiDAR and camera data. The framework comprises a sensor fusion algorithm, a 3D-to-2D projection mechanism, and a Metropolis-Hastings-based trajectory optimization strategy that refines object tubes and constructs concise, temporally shifted synopses. This design selectively preserves and repositions salient information across space and time, enhancing synopsis clarity while reducing computational overhead. Experimental evaluations conducted on standard datasets (KITTI, Cityscapes, and DVS) demonstrate that our framework achieves a favorable balance between segmentation accuracy and inference speed. In comparison with existing studies, it yields superior performance in terms of frame reduction, recall, and F1 score. The results highlight the robustness, real-time capability, and broad applicability of the proposed approach to intelligent surveillance, smart infrastructure, and autonomous mobility systems.
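The abstract only names the framework's building blocks, but two of them correspond to well-known techniques that a short sketch can make concrete. First, the 3D-to-2D projection mechanism: fusing LiDAR with camera data typically requires projecting each LiDAR point into the image plane through the extrinsic LiDAR-to-camera transform and the camera projection matrix. Below is a minimal sketch assuming KITTI-style calibration matrices; the function and argument names are illustrative, not the authors' implementation.

```python
import numpy as np

def project_lidar_to_image(points_xyz, T_velo_to_cam, P_rect):
    """Project LiDAR points (N, 3) to pixel coordinates.

    T_velo_to_cam: (4, 4) rigid transform from the LiDAR to the camera frame.
    P_rect:        (3, 4) rectified camera projection matrix.
    Returns (M, 2) pixel coordinates and (M,) depths for points in front of the camera.
    """
    pts_h = np.hstack([points_xyz, np.ones((points_xyz.shape[0], 1))])
    cam = T_velo_to_cam @ pts_h.T      # (4, N): points in the camera frame
    cam = cam[:, cam[2, :] > 0]        # discard points behind the image plane
    uvw = P_rect @ cam                 # (3, M): homogeneous pixel coordinates
    uv = (uvw[:2, :] / uvw[2, :]).T    # perspective divide -> (M, 2)
    return uv, cam[2, :]               # pixels plus per-point depth for fusion
```

Second, the Metropolis-Hastings-based trajectory optimization can be read as sampling temporal start times for object tubes under an energy that trades tube collisions against synopsis length. In the toy version below, the energy terms are hypothetical placeholders (the abstract does not specify the paper's actual objective); only the accept/reject rule is standard Metropolis-Hastings.

```python
import numpy as np

rng = np.random.default_rng(0)

def energy(starts, lengths, overlap_cost=1.0, length_cost=0.01):
    """Hypothetical energy: penalize temporal overlap between tubes and total length."""
    starts = np.asarray(starts)
    ends = starts + np.asarray(lengths)
    e = length_cost * ends.max()
    for i in range(len(starts)):
        for j in range(i + 1, len(starts)):
            overlap = min(ends[i], ends[j]) - max(starts[i], starts[j])
            if overlap > 0:
                e += overlap_cost * overlap
    return e

def optimize_shifts(lengths, n_iters=5000, temperature=1.0, max_start=200):
    """Metropolis-Hastings over integer start times, one per object tube."""
    starts = rng.integers(0, max_start, size=len(lengths))
    e = energy(starts, lengths)
    for _ in range(n_iters):
        cand = starts.copy()
        k = rng.integers(len(lengths))
        cand[k] = rng.integers(0, max_start)  # symmetric proposal: redraw one start
        e_cand = energy(cand, lengths)
        # Accept with probability min(1, exp(-(E_cand - E) / T)).
        if e_cand <= e or rng.random() < np.exp((e - e_cand) / temperature):
            starts, e = cand, e_cand
    return starts

# e.g., optimize_shifts([120, 80, 60]) yields one start frame per tube.
```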
Journal Introduction
The Journal of Industrial Information Integration focuses on industry's transition toward industrial integration and informatization, covering not only hardware and software but also information integration. It serves as an interdisciplinary forum in which researchers, practitioners, and policy makers present advances, challenges, and solutions in industrial information integration.
The journal welcomes papers on the foundational, technical, and practical aspects of industrial information integration, with an emphasis on the complex, cross-disciplinary topics that arise in industrial integration. Techniques from the mathematical sciences, computer science, computer engineering, electrical and electronic engineering, manufacturing engineering, and engineering management are central to this work.