Yitong Fu, Haiyan Li, Yujiao Wang, Wenbing Lv, Bingbing He, Pengfei Yu
Expert Systems with Applications, Volume 280, Article 127621. Published 2025-04-10. DOI: 10.1016/j.eswa.2025.127621. Impact factor: 7.5; JCR: Q1 (Computer Science, Artificial Intelligence). https://www.sciencedirect.com/science/article/pii/S0957417425012436
EGNL-FAT: An Edge-Guided Non-Local network with Frequency-Aware transformer for smoke segmentation
Semantic segmentation of smoke is considerably harder than that of most other objects because of smoke's special characteristics, such as its non-rigid structure, translucency, and variable modes. We therefore design an Edge-Guided Non-Local Network with Frequency-Aware Transformer (EGNL-FAT) for smoke segmentation in real-world scenarios. Unlike traditional smoke segmentation methods, this paper first takes into account that CNNs and transformers differ in feature extraction capability because of their different receptive fields, and constructs a non-local network to strengthen the model's ability to establish long-range dependencies. Second, edge information is used to alleviate the interference caused by intraclass inconsistency and interclass similarity during segmentation. To address the scarcity of real-scene datasets, we introduce the Forest-Scene Smoke Segmentation (FSS) dataset, which contains diverse smoke types and complex backgrounds. We also propose a frequency-aware transformer that combines the Fast Fourier Transform (FFT) with frequency-domain decoupling to reduce computational complexity. A Cross-Domain Fusion Module (CDFM) is introduced for collaborative learning of information from different sources using weighted fusion and Coordinate Attention (CA). Additionally, we develop an Edge Feature Extraction Module (EFEM) that uses a shallow structure to extract detailed edge information efficiently at full resolution. Finally, a Multi-Directional Cross Attention (MDCA) mechanism is proposed to compute similarities between edge and decoder feature maps, guiding accurate segmentation. Experimental results show that EGNL-FAT achieves a mean Intersection over Union (mIoU) of 59.66% (FS01) and 64.68% (FS02) on FSS, and 79.04% on SMOKE5K. It demonstrates excellent performance while maintaining satisfactory model complexity and processing efficiency. Our code and dataset are publicly available at: https://github.com/yitccc/smoke-segmentation/tree/master.
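For readers unfamiliar with the reported metric, the following is a minimal sketch of how a mean Intersection over Union can be computed for a pair of label masks. It is not taken from the authors' repository: the function name `mean_iou`, the two-class (background/smoke) default, and the choice to skip classes absent from both masks are assumptions made for illustration, and the abstract does not say whether the paper averages per image or over the whole dataset.

```python
from typing import List, Sequence


def mean_iou(
    pred: Sequence[Sequence[int]],
    target: Sequence[Sequence[int]],
    num_classes: int = 2,  # assumption: 0 = background, 1 = smoke
) -> float:
    """Mean IoU over classes for two label masks of identical shape."""
    intersection: List[int] = [0] * num_classes
    union: List[int] = [0] * num_classes

    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                # Pixel counts towards both intersection and union of its class.
                intersection[p] += 1
                union[p] += 1
            else:
                # Pixel belongs to the union of both classes involved.
                union[p] += 1
                union[t] += 1

    # Assumption: classes that appear in neither mask are left out of the mean.
    ious = [intersection[c] / union[c] for c in range(num_classes) if union[c] > 0]
    if not ious:
        raise ValueError("masks are empty")
    return sum(ious) / len(ious)


# Hypothetical 2x3 masks, for illustration only.
example_pred = [[0, 1, 1], [0, 0, 1]]
example_target = [[0, 1, 0], [0, 1, 1]]
example_score = mean_iou(example_pred, example_target)
```

In this toy example each class has two correctly labelled pixels out of a union of four, so both per-class IoUs and their mean are 0.5.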
About the journal:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.