High-performance fire detection framework based on feature enhancement and multimodal fusion

Impact Factor: 3.4 · Q1 (Public, Environmental & Occupational Health)
Zekun Zhou, Hongyang Zhao, Xingdong Li, Yi Liu, Tao Jiang, Jing Jin, Yanan Guo
{"title":"High-performance fire detection framework based on feature enhancement and multimodal fusion","authors":"Zekun Zhou ,&nbsp;Hongyang Zhao ,&nbsp;Xingdong Li ,&nbsp;Yi Liu ,&nbsp;Tao Jiang ,&nbsp;Jing Jin ,&nbsp;Yanan Guo","doi":"10.1016/j.jnlssr.2025.03.004","DOIUrl":null,"url":null,"abstract":"<div><div>Fire detection technology has become increasingly critical in the context of rising global fire threats and extreme weather conditions. Traditional methods rely on single-modal sensors and conventional image processing techniques and often struggle with complex environmental variations and background interference. This study proposes an innovative multimodal fire detection framework that integrates advanced deep learning techniques to address these limitations. By leveraging a comprehensive approach that combines YOLOv8-based object detection, HSV color space enhancement, completed local binary pattern (CLBP) texture analysis, and a novel dynamic feature enhancement module (DFEM), the proposed method significantly improves fire detection accuracy and robustness. This research introduces a sophisticated multimodal fusion strategy that systematically processes fire-related features across multiple domains. A key innovation is the cross-modality fusion Mamba (CMFM) module, which employs efficient channel attention (ECA) and an efficient 2D-selective scan module (E2DSM) to dynamically integrate and refine features from different modalities. Experimental validation was conducted on a dataset that we collected, which was supplemented by data collected via real-world robotic image acquisition in diverse environments, including forests, corridors, and outdoor settings. The proposed method demonstrated exceptional performance, with a precision of 96.4%, a recall of 95.7%, and an overall accuracy of 95.8%, outperforming state-of-the-art models such as VGG16, ResNet50, YOLOv5, and YOLOv8. Ablation studies further validated the contribution of each module and highlighted the framework’s robust feature enhancement and fusion capabilities.</div></div>","PeriodicalId":62710,"journal":{"name":"安全科学与韧性(英文)","volume":"7 1","pages":"Article 100212"},"PeriodicalIF":3.4000,"publicationDate":"2025-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"安全科学与韧性(英文)","FirstCategoryId":"1087","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2666449625000465","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH","Score":null,"Total":0}
Citations: 0

Abstract

Fire detection technology has become increasingly critical in the context of rising global fire threats and extreme weather conditions. Traditional methods rely on single-modal sensors and conventional image processing techniques and often struggle with complex environmental variations and background interference. This study proposes an innovative multimodal fire detection framework that integrates advanced deep learning techniques to address these limitations. By combining YOLOv8-based object detection, HSV color space enhancement, completed local binary pattern (CLBP) texture analysis, and a novel dynamic feature enhancement module (DFEM), the proposed method significantly improves fire detection accuracy and robustness. The research introduces a multimodal fusion strategy that systematically processes fire-related features across multiple domains. A key innovation is the cross-modality fusion Mamba (CMFM) module, which employs efficient channel attention (ECA) and an efficient 2D-selective scan module (E2DSM) to dynamically integrate and refine features from different modalities. Experimental validation was conducted on a self-collected dataset, supplemented by images acquired by a robot operating in diverse real-world environments, including forests, corridors, and outdoor settings. The proposed method demonstrated exceptional performance, with a precision of 96.4%, a recall of 95.7%, and an overall accuracy of 95.8%, outperforming state-of-the-art models such as VGG16, ResNet50, YOLOv5, and YOLOv8. Ablation studies further validated the contribution of each module and highlighted the framework's robust feature enhancement and fusion capabilities.
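The HSV color space enhancement step can be illustrated with a minimal sketch. Flame pixels concentrate in the red-orange-yellow hue band with high saturation and brightness, so a coarse fire-color mask can be obtained by thresholding in HSV. The thresholds below are illustrative assumptions, not values published in the paper:

```python
import cv2
import numpy as np

def hsv_fire_mask(bgr_image: np.ndarray) -> np.ndarray:
    """Return a binary mask of pixels whose hue/saturation/value fall in a
    typical flame range. Thresholds are illustrative, not the paper's."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    # Flames tend toward red-orange-yellow hues with high saturation and
    # brightness; OpenCV hue spans 0-179, so this band covers roughly
    # red through yellow (the wrap-around deep-red band 170-179 is ignored
    # here for simplicity).
    lower = np.array([0, 80, 150], dtype=np.uint8)
    upper = np.array([35, 255, 255], dtype=np.uint8)
    mask = cv2.inRange(hsv, lower, upper)
    # Light morphological opening suppresses isolated noisy pixels.
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    return cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)
```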
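CLBP texture analysis extends the classic local binary pattern by encoding not only the sign of each local gray-level difference (CLBP_S) but also its magnitude (CLBP_M) and the center intensity (CLBP_C). The following is a simplified sketch of the sign and magnitude codes, using a square 8-neighbourhood rather than the circular sampling of the original CLBP formulation; the paper's exact parameters are not given in the abstract:

```python
import numpy as np

def clbp_components(gray: np.ndarray):
    """Compute the sign (CLBP_S) and magnitude (CLBP_M) codes of the
    completed LBP (Guo et al., 2010) for interior pixels, using a square
    8-neighbourhood as a simplification of circular sampling."""
    h, w = gray.shape
    center = gray[1:-1, 1:-1].astype(np.int32)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    # Local differences between each neighbour and the centre pixel.
    diffs = np.stack([
        gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx].astype(np.int32) - center
        for dy, dx in offsets
    ])
    weights = (1 << np.arange(8)).reshape(8, 1, 1)
    # CLBP_S: the sign of each difference packed into an 8-bit code
    # (this is the classic LBP operator).
    clbp_s = ((diffs >= 0) * weights).sum(axis=0).astype(np.uint8)
    # CLBP_M: each difference magnitude thresholded against the global
    # mean magnitude, packed the same way.
    mags = np.abs(diffs)
    clbp_m = ((mags >= mags.mean()) * weights).sum(axis=0).astype(np.uint8)
    return clbp_s, clbp_m
```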
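Efficient channel attention (ECA), which the abstract names as a component of the CMFM module, reweights feature channels using a global average pool followed by a 1-D convolution across the channel axis, avoiding the dimensionality reduction of squeeze-and-excitation. The sketch below follows the published ECA formulation (Wang et al., CVPR 2020); how it is wired into CMFM is not specified by the abstract:

```python
import math
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Efficient channel attention: global average pooling plus a 1-D
    convolution over channels, with no channel dimensionality reduction."""
    def __init__(self, channels: int, gamma: int = 2, b: int = 1):
        super().__init__()
        # Kernel size adapts to the channel count, as in the ECA paper.
        t = int(abs((math.log2(channels) + b) / gamma))
        k = t if t % 2 else t + 1
        self.conv = nn.Conv1d(1, 1, kernel_size=k, padding=k // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (N, C, H, W) -> per-channel descriptor (N, C, 1, 1)
        y = x.mean(dim=(2, 3), keepdim=True)
        # Conv1d over the channel axis: (N, 1, C) -> (N, 1, C)
        y = self.conv(y.squeeze(-1).transpose(-1, -2))
        y = y.transpose(-1, -2).unsqueeze(-1)
        # Scale each channel by its learned attention weight.
        return x * self.sigmoid(y)
```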
Source journal
Journal of Safety Science and Resilience (安全科学与韧性(英文))
Subject areas: Management Science and Operations Research; Safety, Risk, Reliability and Quality; Safety Research
CiteScore: 8.70
Self-citation rate: 0.00%
Average review time: 72 days