IFGAN—A Novel Image Fusion Model to Fuse 3D Point Cloud Sensory Data

IF 3.3 Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS
H. Ignatious, Hesham El-Sayed, Salah Bouktif
{"title":"IFGAN--融合三维点云传感数据的新型图像融合模型","authors":"H. Ignatious, Hesham El-Sayed, Salah Bouktif","doi":"10.3390/jsan13010015","DOIUrl":null,"url":null,"abstract":"To enhance the level of autonomy in driving, it is crucial to ensure optimal execution of critical maneuvers in all situations. However, numerous accidents involving autonomous vehicles (AVs) developed by major automobile manufacturers in recent years have been attributed to poor decision making caused by insufficient perception of environmental information. AVs employ diverse sensors in today’s technology-driven settings to gather this information. However, due to technical and natural factors, the data collected by these sensors may be incomplete or ambiguous, leading to misinterpretation by AVs and resulting in fatal accidents. Furthermore, environmental information obtained from multiple sources in the vehicular environment often exhibits multimodal characteristics. To address this limitation, effective preprocessing of raw sensory data becomes essential, involving two crucial tasks: data cleaning and data fusion. In this context, we propose a comprehensive data fusion engine that categorizes various sensory data formats and appropriately merges them to enhance accuracy. Specifically, we suggest a general framework to combine audio, visual, and textual data, building upon our previous research on an innovative hybrid image fusion model that fused multispectral image data. However, this previous model faced challenges when fusing 3D point cloud data and handling large volumes of sensory data. To overcome these challenges, our study introduces a novel image fusion model called Image Fusion Generative Adversarial Network (IFGAN), which incorporates a multi-scale attention mechanism into both the generator and discriminator of a Generative Adversarial Network (GAN). The primary objective of image fusion is to merge complementary data from various perspectives of the same scene to enhance the clarity and detail of the final image. The multi-scale attention mechanism serves two purposes: the first, capturing comprehensive spatial information to enable the generator to focus on foreground and background target information in the sensory data, and the second, constraining the discriminator to concentrate on attention regions rather than the entire input image. Furthermore, the proposed model integrates the color information retention concept from the previously proposed image fusion model. Furthermore, we propose simple and efficient models for extracting salient image features. We evaluate the proposed models using various standard metrics and compare them with existing popular models. The results demonstrate that our proposed image fusion model outperforms the other models in terms of performance.","PeriodicalId":37584,"journal":{"name":"Journal of Sensor and Actuator Networks","volume":null,"pages":null},"PeriodicalIF":3.3000,"publicationDate":"2024-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"IFGAN—A Novel Image Fusion Model to Fuse 3D Point Cloud Sensory Data\",\"authors\":\"H. Ignatious, Hesham El-Sayed, Salah Bouktif\",\"doi\":\"10.3390/jsan13010015\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"To enhance the level of autonomy in driving, it is crucial to ensure optimal execution of critical maneuvers in all situations. 
However, numerous accidents involving autonomous vehicles (AVs) developed by major automobile manufacturers in recent years have been attributed to poor decision making caused by insufficient perception of environmental information. AVs employ diverse sensors in today’s technology-driven settings to gather this information. However, due to technical and natural factors, the data collected by these sensors may be incomplete or ambiguous, leading to misinterpretation by AVs and resulting in fatal accidents. Furthermore, environmental information obtained from multiple sources in the vehicular environment often exhibits multimodal characteristics. To address this limitation, effective preprocessing of raw sensory data becomes essential, involving two crucial tasks: data cleaning and data fusion. In this context, we propose a comprehensive data fusion engine that categorizes various sensory data formats and appropriately merges them to enhance accuracy. Specifically, we suggest a general framework to combine audio, visual, and textual data, building upon our previous research on an innovative hybrid image fusion model that fused multispectral image data. However, this previous model faced challenges when fusing 3D point cloud data and handling large volumes of sensory data. To overcome these challenges, our study introduces a novel image fusion model called Image Fusion Generative Adversarial Network (IFGAN), which incorporates a multi-scale attention mechanism into both the generator and discriminator of a Generative Adversarial Network (GAN). The primary objective of image fusion is to merge complementary data from various perspectives of the same scene to enhance the clarity and detail of the final image. The multi-scale attention mechanism serves two purposes: the first, capturing comprehensive spatial information to enable the generator to focus on foreground and background target information in the sensory data, and the second, constraining the discriminator to concentrate on attention regions rather than the entire input image. Furthermore, the proposed model integrates the color information retention concept from the previously proposed image fusion model. Furthermore, we propose simple and efficient models for extracting salient image features. We evaluate the proposed models using various standard metrics and compare them with existing popular models. 
The results demonstrate that our proposed image fusion model outperforms the other models in terms of performance.\",\"PeriodicalId\":37584,\"journal\":{\"name\":\"Journal of Sensor and Actuator Networks\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":3.3000,\"publicationDate\":\"2024-02-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Sensor and Actuator Networks\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3390/jsan13010015\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Sensor and Actuator Networks","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/jsan13010015","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Citations: 0

Abstract

To enhance the level of autonomy in driving, it is crucial to ensure optimal execution of critical maneuvers in all situations. However, numerous accidents involving autonomous vehicles (AVs) developed by major automobile manufacturers in recent years have been attributed to poor decision making caused by insufficient perception of environmental information. AVs employ diverse sensors in today’s technology-driven settings to gather this information. However, due to technical and natural factors, the data collected by these sensors may be incomplete or ambiguous, leading to misinterpretation by AVs and resulting in fatal accidents. Furthermore, environmental information obtained from multiple sources in the vehicular environment often exhibits multimodal characteristics. To address this limitation, effective preprocessing of raw sensory data becomes essential, involving two crucial tasks: data cleaning and data fusion. In this context, we propose a comprehensive data fusion engine that categorizes various sensory data formats and appropriately merges them to enhance accuracy. Specifically, we suggest a general framework to combine audio, visual, and textual data, building upon our previous research on an innovative hybrid image fusion model that fused multispectral image data. However, this previous model faced challenges when fusing 3D point cloud data and handling large volumes of sensory data. To overcome these challenges, our study introduces a novel image fusion model called Image Fusion Generative Adversarial Network (IFGAN), which incorporates a multi-scale attention mechanism into both the generator and discriminator of a Generative Adversarial Network (GAN). The primary objective of image fusion is to merge complementary data from various perspectives of the same scene to enhance the clarity and detail of the final image. The multi-scale attention mechanism serves two purposes: the first, capturing comprehensive spatial information to enable the generator to focus on foreground and background target information in the sensory data, and the second, constraining the discriminator to concentrate on attention regions rather than the entire input image. Furthermore, the proposed model integrates the color information retention concept from the previously proposed image fusion model. Furthermore, we propose simple and efficient models for extracting salient image features. We evaluate the proposed models using various standard metrics and compare them with existing popular models. The results demonstrate that our proposed image fusion model outperforms the other models in terms of performance.
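The abstract does not include implementation details, so the following is a minimal illustrative sketch (PyTorch) of the kind of multi-scale spatial attention described above, applied inside a fusion generator that merges an RGB frame with a projected point-cloud map. The module names (MultiScaleAttention, FusionGenerator), layer sizes, pooling scales, and the two-input setup are assumptions for illustration only, not the authors' published IFGAN architecture.

```python
# Illustrative sketch, not the authors' code: a multi-scale spatial attention
# block re-weights encoder features so the network can emphasize foreground and
# background targets, as described in the abstract. All sizes are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiScaleAttention(nn.Module):
    """Build a spatial attention map from features scored at several scales."""

    def __init__(self, channels: int, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales
        # One 1x1 conv per scale produces a single-channel spatial score map.
        self.score = nn.ModuleList([nn.Conv2d(channels, 1, kernel_size=1) for _ in scales])

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[-2:]
        maps = []
        for s, conv in zip(self.scales, self.score):
            pooled = F.avg_pool2d(x, kernel_size=s) if s > 1 else x
            score = conv(pooled)
            # Upsample every score map back to the input resolution.
            maps.append(F.interpolate(score, size=(h, w), mode="bilinear", align_corners=False))
        attention = torch.sigmoid(torch.stack(maps, dim=0).mean(dim=0))
        return x * attention  # re-weight features by the fused attention map


class FusionGenerator(nn.Module):
    """Fuse two aligned 3-channel inputs (e.g., RGB image and point-cloud projection)."""

    def __init__(self, channels: int = 32):
        super().__init__()
        self.encode = nn.Sequential(nn.Conv2d(6, channels, 3, padding=1), nn.ReLU(inplace=True))
        self.attend = MultiScaleAttention(channels)
        self.decode = nn.Conv2d(channels, 3, 3, padding=1)

    def forward(self, rgb: torch.Tensor, depth_map: torch.Tensor) -> torch.Tensor:
        feats = self.encode(torch.cat([rgb, depth_map], dim=1))
        return torch.tanh(self.decode(self.attend(feats)))


if __name__ == "__main__":
    g = FusionGenerator()
    fused = g(torch.rand(1, 3, 128, 128), torch.rand(1, 3, 128, 128))
    print(fused.shape)  # torch.Size([1, 3, 128, 128])
```

In a full GAN setup of the kind the abstract outlines, a similar attention block would also be inserted into the discriminator so that its real/fused decision is driven by the attended regions rather than the whole input image.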
Source Journal
Journal of Sensor and Actuator Networks (Physics and Astronomy: Instrumentation)
CiteScore: 7.90
Self-citation rate: 2.90%
Articles per year: 70
Review time: 11 weeks
Journal description: Journal of Sensor and Actuator Networks (ISSN 2224-2708) is an international open access journal on the science and technology of sensor and actuator networks. It publishes regular research papers, reviews (including comprehensive reviews on complete sensor and actuator networks), and short communications. Our aim is to encourage scientists to publish their experimental and theoretical results in as much detail as possible. There is no restriction on the length of the papers. The full experimental details must be provided so that the results can be reproduced.