用于烟雾语义分割的牛顿插值网络

IF 7.5 1区 计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Feiniu Yuan , Guiqian Wang , Qinghua Huang , Xuelong Li
{"title":"用于烟雾语义分割的牛顿插值网络","authors":"Feiniu Yuan ,&nbsp;Guiqian Wang ,&nbsp;Qinghua Huang ,&nbsp;Xuelong Li","doi":"10.1016/j.patcog.2024.111119","DOIUrl":null,"url":null,"abstract":"<div><div>Smoke has large variances of visual appearances that are very adverse to visual segmentation. Furthermore, its semi-transparency often produces highly complicated mixtures of smoke and backgrounds. These factors lead to great difficulties in labelling and segmenting smoke regions. To improve accuracy of smoke segmentation, we propose a Newton Interpolation Network (NINet) for visual smoke semantic segmentation. Unlike simply concatenating or point-wisely adding multi-scale encoded feature maps for information fusion or re-usage, we design a Newton Interpolation Module (NIM) to extract structured information by analyzing the feature values in the same position but from encoded feature maps with different scales. Interpolated features by our NIM contain long-range dependency and semantic structures across different levels, but traditional fusion of multi-scale feature maps cannot model intrinsic structures embedded in these maps. To obtain multi-scale structured information, we repeatedly use the proposed NIM at different levels of the decoding stages. In addition, we use more encoded feature maps to construct a higher order Newton interpolation polynomial for extracting higher order information. Extensive experiments validate that our method significantly outperforms existing state-of-the-art algorithms on virtual and real smoke datasets, and ablation experiments also validate the effectiveness of our NIMs.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"159 ","pages":"Article 111119"},"PeriodicalIF":7.5000,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A newton interpolation network for smoke semantic segmentation\",\"authors\":\"Feiniu Yuan ,&nbsp;Guiqian Wang ,&nbsp;Qinghua Huang ,&nbsp;Xuelong Li\",\"doi\":\"10.1016/j.patcog.2024.111119\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Smoke has large variances of visual appearances that are very adverse to visual segmentation. Furthermore, its semi-transparency often produces highly complicated mixtures of smoke and backgrounds. These factors lead to great difficulties in labelling and segmenting smoke regions. To improve accuracy of smoke segmentation, we propose a Newton Interpolation Network (NINet) for visual smoke semantic segmentation. Unlike simply concatenating or point-wisely adding multi-scale encoded feature maps for information fusion or re-usage, we design a Newton Interpolation Module (NIM) to extract structured information by analyzing the feature values in the same position but from encoded feature maps with different scales. Interpolated features by our NIM contain long-range dependency and semantic structures across different levels, but traditional fusion of multi-scale feature maps cannot model intrinsic structures embedded in these maps. To obtain multi-scale structured information, we repeatedly use the proposed NIM at different levels of the decoding stages. In addition, we use more encoded feature maps to construct a higher order Newton interpolation polynomial for extracting higher order information. Extensive experiments validate that our method significantly outperforms existing state-of-the-art algorithms on virtual and real smoke datasets, and ablation experiments also validate the effectiveness of our NIMs.</div></div>\",\"PeriodicalId\":49713,\"journal\":{\"name\":\"Pattern Recognition\",\"volume\":\"159 \",\"pages\":\"Article 111119\"},\"PeriodicalIF\":7.5000,\"publicationDate\":\"2024-10-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Pattern Recognition\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0031320324008707\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Pattern Recognition","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0031320324008707","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

摘要

烟雾的视觉外观差异很大,非常不利于视觉分割。此外,烟雾的半透明性往往会产生非常复杂的烟雾和背景混合物。这些因素给烟雾区域的标记和分割带来了很大困难。为了提高烟雾分割的准确性,我们提出了一种用于视觉烟雾语义分割的牛顿插值网络(NINet)。不同于简单地将多尺度编码特征图进行连接或点向添加以实现信息融合或重复使用,我们设计了牛顿插值模块(NIM),通过分析同一位置但来自不同尺度编码特征图的特征值来提取结构化信息。我们的牛顿插值模块所插值的特征包含跨不同层次的长距离依赖关系和语义结构,但传统的多尺度特征图融合无法模拟这些特征图中蕴含的内在结构。为了获得多尺度结构信息,我们在解码阶段的不同层次反复使用了所提出的 NIM。此外,我们使用更多的编码特征图来构建高阶牛顿插值多项式,以提取更高阶的信息。大量实验证明,在虚拟和真实烟雾数据集上,我们的方法明显优于现有的最先进算法,而消融实验也验证了我们的 NIMs 的有效性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
A newton interpolation network for smoke semantic segmentation
Smoke has large variances of visual appearances that are very adverse to visual segmentation. Furthermore, its semi-transparency often produces highly complicated mixtures of smoke and backgrounds. These factors lead to great difficulties in labelling and segmenting smoke regions. To improve accuracy of smoke segmentation, we propose a Newton Interpolation Network (NINet) for visual smoke semantic segmentation. Unlike simply concatenating or point-wisely adding multi-scale encoded feature maps for information fusion or re-usage, we design a Newton Interpolation Module (NIM) to extract structured information by analyzing the feature values in the same position but from encoded feature maps with different scales. Interpolated features by our NIM contain long-range dependency and semantic structures across different levels, but traditional fusion of multi-scale feature maps cannot model intrinsic structures embedded in these maps. To obtain multi-scale structured information, we repeatedly use the proposed NIM at different levels of the decoding stages. In addition, we use more encoded feature maps to construct a higher order Newton interpolation polynomial for extracting higher order information. Extensive experiments validate that our method significantly outperforms existing state-of-the-art algorithms on virtual and real smoke datasets, and ablation experiments also validate the effectiveness of our NIMs.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Pattern Recognition
Pattern Recognition 工程技术-工程:电子与电气
CiteScore
14.40
自引率
16.20%
发文量
683
审稿时长
5.6 months
期刊介绍: The field of Pattern Recognition is both mature and rapidly evolving, playing a crucial role in various related fields such as computer vision, image processing, text analysis, and neural networks. It closely intersects with machine learning and is being applied in emerging areas like biometrics, bioinformatics, multimedia data analysis, and data science. The journal Pattern Recognition, established half a century ago during the early days of computer science, has since grown significantly in scope and influence.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信