Jiangtao Nie, Lei Zhang, Chongxing Song, Zhiqiang Lang, Weixin Ren, Wei Wei, Chen Ding, Yanning Zhang
{"title":"有效光谱超分辨的广义像素感知深度函数混合网络","authors":"Jiangtao Nie , Lei Zhang , Chongxing Song , Zhiqiang Lang , Weixin Ren , Wei Wei , Chen Ding , Yanning Zhang","doi":"10.1016/j.knosys.2025.113743","DOIUrl":null,"url":null,"abstract":"<div><div>Recent progress on spectral super-resolution (SR) mainly focuses on directly mapping an RGB image to its HSI counterpart using deep convolutional neural networks, <em>i.e.,</em> non-linearly transform the RGB context within a size-fixed receptive field centered at each pixel to its spectrum using a universal deep mapping function. However, in real scenarios, pixels in HSIs inevitably require size-different receptive fields and distinct mapping functions due to their differences in object category or spatial position, and consequently, these existing methods show limited generalization capacity, especially when the imaging scene is complicated. To tackle this issue, we introduce a pixel-aware deep function-mixture network (PADFMN) for SSR, which consists of a novel class of modules called function-mixture (FM) blocks. Each FM block contains several basis functions, represented by parallel subnets with varying receptive field sizes. Additionally, a separate subnet functions as a mixing function, generating pixel-level weights that linearly combine the outputs of the basis functions. This approach allows the network to dynamically adjust the receptive field size and mapping function for each pixel based on its specific characteristics. Through stacking several such FM blocks together and fusing their intermediate feature representations, we can obtain an effective SSR network with flexibility in learning pixel-wise deep mapping functions as well as better generalization capacity. Moreover, with the aim of employing the proposed PADFMN to cope with two more challenging SSR tasks, including cross-sensor SSR (<em>i.e.,</em> test on RGB image shot by a new sensor with unseen spectral response function) and scale-arbitrary SSR (<em>i.e.,</em> the spectral resolution of HSI to reconstruct can be arbitrarily determined), we extend the core FM blocks to two more generalized versions, namely sensor-guided FM block and scale-guided FM block. The former is able to cast the sensor-related information (<em>e.g.,</em> spectral response function) into guidance via dynamic filters to assist the spectral reconstruction using the basic FM block. This is beneficial for reducing the distribution shift between the training and test images incurred by unseen RGB sensors in terms of establishing the deep mapping function, thus leading to pleasing performance in cross-sensor SSR tasks. On the other hand, the latter encodes the user-determined spectral resolution to control the channel dimension of the feature output by the last basic FM block precisely via dynamically generating corresponding convolution filters, so that the network can reconstruct HSI with an arbitrarily determined scale while keeping the spectrum accuracy. 
We test the proposed method on three benchmark datasets, and it achieves state-of-the-art performance in SSR, cross-sensor SSR, and scale-arbitrary SSR tasks.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"323 ","pages":"Article 113743"},"PeriodicalIF":7.2000,"publicationDate":"2025-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Generalized pixel-aware deep function-mixture network for effective spectral super-resolution\",\"authors\":\"Jiangtao Nie , Lei Zhang , Chongxing Song , Zhiqiang Lang , Weixin Ren , Wei Wei , Chen Ding , Yanning Zhang\",\"doi\":\"10.1016/j.knosys.2025.113743\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Recent progress on spectral super-resolution (SR) mainly focuses on directly mapping an RGB image to its HSI counterpart using deep convolutional neural networks, <em>i.e.,</em> non-linearly transform the RGB context within a size-fixed receptive field centered at each pixel to its spectrum using a universal deep mapping function. However, in real scenarios, pixels in HSIs inevitably require size-different receptive fields and distinct mapping functions due to their differences in object category or spatial position, and consequently, these existing methods show limited generalization capacity, especially when the imaging scene is complicated. To tackle this issue, we introduce a pixel-aware deep function-mixture network (PADFMN) for SSR, which consists of a novel class of modules called function-mixture (FM) blocks. Each FM block contains several basis functions, represented by parallel subnets with varying receptive field sizes. Additionally, a separate subnet functions as a mixing function, generating pixel-level weights that linearly combine the outputs of the basis functions. This approach allows the network to dynamically adjust the receptive field size and mapping function for each pixel based on its specific characteristics. Through stacking several such FM blocks together and fusing their intermediate feature representations, we can obtain an effective SSR network with flexibility in learning pixel-wise deep mapping functions as well as better generalization capacity. Moreover, with the aim of employing the proposed PADFMN to cope with two more challenging SSR tasks, including cross-sensor SSR (<em>i.e.,</em> test on RGB image shot by a new sensor with unseen spectral response function) and scale-arbitrary SSR (<em>i.e.,</em> the spectral resolution of HSI to reconstruct can be arbitrarily determined), we extend the core FM blocks to two more generalized versions, namely sensor-guided FM block and scale-guided FM block. The former is able to cast the sensor-related information (<em>e.g.,</em> spectral response function) into guidance via dynamic filters to assist the spectral reconstruction using the basic FM block. This is beneficial for reducing the distribution shift between the training and test images incurred by unseen RGB sensors in terms of establishing the deep mapping function, thus leading to pleasing performance in cross-sensor SSR tasks. On the other hand, the latter encodes the user-determined spectral resolution to control the channel dimension of the feature output by the last basic FM block precisely via dynamically generating corresponding convolution filters, so that the network can reconstruct HSI with an arbitrarily determined scale while keeping the spectrum accuracy. 
We test the proposed method on three benchmark datasets, and it achieves state-of-the-art performance in SSR, cross-sensor SSR, and scale-arbitrary SSR tasks.</div></div>\",\"PeriodicalId\":49939,\"journal\":{\"name\":\"Knowledge-Based Systems\",\"volume\":\"323 \",\"pages\":\"Article 113743\"},\"PeriodicalIF\":7.2000,\"publicationDate\":\"2025-05-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Knowledge-Based Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0950705125007890\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Knowledge-Based Systems","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0950705125007890","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Generalized pixel-aware deep function-mixture network for effective spectral super-resolution
Recent progress in spectral super-resolution (SSR) mainly focuses on directly mapping an RGB image to its hyperspectral image (HSI) counterpart using deep convolutional neural networks, i.e., non-linearly transforming the RGB context within a fixed-size receptive field centered at each pixel into its spectrum with a universal deep mapping function. However, in real scenarios, pixels in HSIs inevitably require receptive fields of different sizes and distinct mapping functions because they differ in object category or spatial position; consequently, existing methods show limited generalization capacity, especially when the imaging scene is complex. To tackle this issue, we introduce a pixel-aware deep function-mixture network (PADFMN) for SSR, which consists of a novel class of modules called function-mixture (FM) blocks. Each FM block contains several basis functions, represented by parallel subnets with varying receptive field sizes. In addition, a separate subnet acts as a mixing function, generating pixel-level weights that linearly combine the outputs of the basis functions. This design allows the network to dynamically adjust the receptive field size and mapping function for each pixel according to its specific characteristics. By stacking several such FM blocks and fusing their intermediate feature representations, we obtain an effective SSR network that flexibly learns pixel-wise deep mapping functions and generalizes better. Moreover, to apply the proposed PADFMN to two more challenging SSR tasks, namely cross-sensor SSR (i.e., testing on RGB images captured by a new sensor with an unseen spectral response function) and scale-arbitrary SSR (i.e., the spectral resolution of the reconstructed HSI can be determined arbitrarily), we extend the core FM block into two generalized versions: the sensor-guided FM block and the scale-guided FM block. The former casts sensor-related information (e.g., the spectral response function) into guidance via dynamic filters to assist the spectral reconstruction performed by the basic FM block. This helps reduce the distribution shift between training and test images caused by unseen RGB sensors when establishing the deep mapping function, leading to strong performance on cross-sensor SSR. The latter encodes the user-specified spectral resolution and, by dynamically generating the corresponding convolution filters, precisely controls the channel dimension of the features output by the last basic FM block, so that the network can reconstruct an HSI at an arbitrarily chosen scale while preserving spectral accuracy. We evaluate the proposed method on three benchmark datasets, where it achieves state-of-the-art performance on SSR, cross-sensor SSR, and scale-arbitrary SSR tasks.
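The core function-mixture idea can be illustrated with a short sketch. The following is a minimal PyTorch sketch, not the authors' released code: the module name `FMBlock`, the channel width, the kernel sizes, and the subnet depths are illustrative assumptions. Parallel convolutional subnets with different kernel sizes play the role of the basis functions, and a separate mixing subnet predicts pixel-level weights that linearly combine their outputs.

```python
# Minimal sketch of a function-mixture (FM) block, assuming simple one-layer
# conv subnets as basis functions; widths and kernel sizes are illustrative.
import torch
import torch.nn as nn


class FMBlock(nn.Module):
    """Mixes basis subnets with different receptive fields via pixel-wise weights."""

    def __init__(self, channels=64, kernel_sizes=(1, 3, 5, 7)):
        super().__init__()
        # Basis functions: parallel conv subnets, one per receptive-field size.
        self.basis = nn.ModuleList(
            [
                nn.Sequential(
                    nn.Conv2d(channels, channels, k, padding=k // 2),
                    nn.ReLU(inplace=True),
                )
                for k in kernel_sizes
            ]
        )
        # Mixing function: predicts one weight per basis function per pixel.
        self.mixer = nn.Sequential(
            nn.Conv2d(channels, len(kernel_sizes), 3, padding=1),
            nn.Softmax(dim=1),
        )

    def forward(self, x):
        weights = self.mixer(x)                                  # (B, K, H, W)
        outs = torch.stack([f(x) for f in self.basis], dim=1)    # (B, K, C, H, W)
        # Linearly combine the basis outputs using the per-pixel weights.
        return (weights.unsqueeze(2) * outs).sum(dim=1)          # (B, C, H, W)


if __name__ == "__main__":
    block = FMBlock(channels=64)
    y = block(torch.randn(1, 64, 32, 32))
    print(y.shape)  # torch.Size([1, 64, 32, 32])
```

Stacking several such blocks and fusing their intermediate feature representations would then give a network of the kind the abstract describes, with the receptive field and mapping function effectively selected per pixel by the learned mixing weights.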
Journal introduction:
Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on systems based on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computation techniques, provide a balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.