A Study on Flame Detection Method Combining Visible Light and Thermal Infrared Multimodal Images

Weining Sun, Yuanhao Liu, Feng Wang, Le Hua, Jianzhong Fu, Songyu Hu

Fire Technology 61(4): 2167–2188 (published online 2024-11-12). DOI: 10.1007/s10694-024-01676-9
Abstract
Fire disasters pose a significant threat to human safety, so timely and effective fire detection is crucial for mitigating them. Combining visible light and thermal infrared images for multimodal flame detection can fully exploit both the visual appearance and the temperature distribution of flames, potentially enhancing the accuracy and robustness of flame detection considerably, which makes it a highly promising approach. However, the visible light and thermal infrared modalities differ fundamentally in imaging principles, pixel resolution, and texture information, so fusing them effectively is challenging. To address this issue, we introduce a novel flame detection method that integrates visible light and thermal infrared images. For the visible light modality, we design an overall model based on Mask R-CNN, with ConvNeXt as the backbone, an FPN as the neck, and a cascade structure as the detection head. For the thermal infrared modality, whose images carry weak semantic but strong texture features, we replace the model's neck with a PAFPN structure to better extract the low-level texture information of the image. Furthermore, we design a multimodal fusion algorithm based on GIoU that fuses the detection results of the two modalities, compensating for their weak alignment caused by these differences in imaging principles, pixel resolution, and texture information. Experimental results on both public and self-collected datasets demonstrate that our method outperforms other mainstream object detection networks in flame detection, and ablation experiments show that multimodal fusion significantly improves the overall performance of the algorithm. Specifically, our method achieves an accuracy of 85.33%, a recall of 99.33%, and an F1 score of 90.03%.
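The abstract does not give implementation details of the GIoU-based fusion, but GIoU (generalized intersection over union) itself has a standard definition: IoU minus the fraction of the smallest enclosing box not covered by the union of the two boxes. The sketch below illustrates that definition and one plausible way detections from the two modalities could be paired with it; `fuse_detections`, its threshold, and the box format are illustrative assumptions, not the authors' published method.

```python
# Minimal sketch of the standard GIoU measure between two axis-aligned boxes.
# Boxes are (x1, y1, x2, y2) in a shared coordinate frame (i.e., after the
# visible and thermal images have been registered to each other).

def giou(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Intersection area
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    # Union area
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    iou = inter / union if union > 0 else 0.0

    # Smallest enclosing box C
    cx1, cy1 = min(ax1, bx1), min(ay1, by1)
    cx2, cy2 = max(ax2, bx2), max(ay2, by2)
    area_c = (cx2 - cx1) * (cy2 - cy1)

    # GIoU = IoU - |C \ (A ∪ B)| / |C|, in the range (-1, 1]
    return iou - (area_c - union) / area_c if area_c > 0 else iou


def fuse_detections(vis_boxes, ir_boxes, giou_thresh=0.0):
    """Hypothetical pairing step: keep a visible-light detection as a
    confirmed flame if some thermal detection overlaps it with GIoU above
    a threshold. Unlike plain IoU, GIoU remains informative even when two
    boxes are disjoint, which makes it more tolerant of the weak
    cross-modal alignment described in the abstract."""
    return [vb for vb in vis_boxes
            if any(giou(vb, ib) > giou_thresh for ib in ir_boxes)]
```

Because GIoU degrades gracefully toward -1 as boxes move apart rather than saturating at 0, a threshold on it can still match slightly misaligned visible and thermal detections of the same flame, which plain IoU matching would discard.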
Journal Introduction
Fire Technology publishes original contributions, both theoretical and empirical, that contribute to the solution of problems in fire safety science and engineering. It is the leading journal in the field, publishing applied research dealing with the full range of actual and potential fire hazards facing humans and the environment. It covers the entire domain of fire safety science and engineering problems relevant in industrial, operational, cultural, and environmental applications, including modeling, testing, detection, suppression, human behavior, wildfires, structures, and risk analysis.
The aim of Fire Technology is to push forward the frontiers of knowledge and technology by encouraging interdisciplinary communication of significant technical developments in fire protection and subjects of scientific interest to the fire protection community at large.
It is published in conjunction with the National Fire Protection Association (NFPA) and the Society of Fire Protection Engineers (SFPE). The mission of NFPA is to help save lives and reduce loss with information, knowledge, and passion. The mission of SFPE is to advance the science and practice of fire protection engineering internationally.