High-performance fire detection framework based on feature enhancement and multimodal fusion

Zekun Zhou, Hongyang Zhao, Xingdong Li, Yi Liu, Tao Jiang, Jing Jin, Yanan Guo

Journal of Safety Science and Resilience, Volume 7, Issue 1, Article 100212. Published 2025-06-26. DOI: 10.1016/j.jnlssr.2025.03.004
Fire detection technology has become increasingly critical in the context of rising global fire threats and extreme weather conditions. Traditional methods rely on single-modal sensors and conventional image-processing techniques, and they often struggle with complex environmental variations and background interference. This study proposes a multimodal fire detection framework that integrates advanced deep learning techniques to address these limitations. By combining YOLOv8-based object detection, HSV color-space enhancement, completed local binary pattern (CLBP) texture analysis, and a novel dynamic feature enhancement module (DFEM), the proposed method significantly improves fire detection accuracy and robustness. The framework employs a multimodal fusion strategy that systematically processes fire-related features across multiple domains. A key innovation is the cross-modality fusion Mamba (CMFM) module, which uses efficient channel attention (ECA) and an efficient 2D-selective scan module (E2DSM) to dynamically integrate and refine features from different modalities. Experimental validation was conducted on a self-collected dataset, supplemented by images acquired by a robot in real-world environments, including forests, corridors, and outdoor settings. The proposed method achieved a precision of 96.4%, a recall of 95.7%, and an overall accuracy of 95.8%, outperforming state-of-the-art models such as VGG16, ResNet50, YOLOv5, and YOLOv8. Ablation studies further validated the contribution of each module and highlighted the framework's robust feature enhancement and fusion capabilities.
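The abstract does not specify how the HSV color-space enhancement isolates fire-like regions. The sketch below is a minimal, hypothetical illustration in Python/OpenCV, assuming fire pixels occupy a red-to-yellow hue band with high saturation and brightness; the threshold values are illustrative placeholders, not the paper's parameters.

```python
import cv2
import numpy as np

def enhance_fire_regions(bgr_image: np.ndarray) -> np.ndarray:
    """Boost fire-like pixels via an HSV mask (illustrative thresholds only)."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    # Hypothetical fire band: red-to-yellow hues with high saturation/value.
    # OpenCV's hue range is 0-179, so red wraps around both ends of the axis.
    mask_low = cv2.inRange(hsv, (0, 80, 150), (35, 255, 255))
    mask_high = cv2.inRange(hsv, (170, 80, 150), (179, 255, 255))
    mask = cv2.bitwise_or(mask_low, mask_high)
    # Brighten masked (fire-like) regions and slightly dim the background.
    enhanced = bgr_image.astype(np.float32)
    enhanced[mask > 0] = np.clip(enhanced[mask > 0] * 1.3, 0, 255)
    enhanced[mask == 0] *= 0.8
    return enhanced.astype(np.uint8)
```

Such a mask could feed the downstream texture and detection stages as an extra channel or as a preprocessed input; the paper's actual integration point is not described in the abstract.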
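The efficient channel attention named in the CMFM module follows the published ECA design (Wang et al., CVPR 2020): a 1D convolution over channel-wise global descriptors in place of a fully connected bottleneck. How ECA is wired into the paper's fusion pipeline is not given here, so the PyTorch sketch below shows only the standard, stand-alone ECA block.

```python
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Standard efficient channel attention block (not the paper's exact CMFM wiring)."""
    def __init__(self, k_size: int = 3):
        super().__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)
        # 1D conv over the channel dimension captures local cross-channel interaction.
        self.conv = nn.Conv1d(1, 1, kernel_size=k_size, padding=k_size // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) -> per-channel descriptor (B, C, 1, 1)
        y = self.avg_pool(x)
        # Reshape to (B, 1, C), convolve across channels, reshape back.
        y = self.conv(y.squeeze(-1).transpose(-1, -2)).transpose(-1, -2).unsqueeze(-1)
        # Gate the input feature map by the learned channel weights.
        return x * self.sigmoid(y)

# Usage: attn = ECA(); out = attn(torch.randn(2, 64, 32, 32))  # out has shape (2, 64, 32, 32)
```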
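For reference, the reported precision, recall, and accuracy follow the standard confusion-matrix definitions; the helper below simply restates those formulas, and any counts passed to it are placeholders rather than the paper's data.

```python
def detection_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Confusion-matrix metrics of the kind reported above."""
    precision = tp / (tp + fp)            # fraction of predicted fires that are real
    recall = tp / (tp + fn)               # fraction of real fires that are detected
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall, "accuracy": accuracy}
```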