Jiangtao Nie, Lei Zhang, Chongxing Song, Zhiqiang Lang, Weixin Ren, Wei Wei, Chen Ding, Yanning Zhang
{"title":"有效光谱超分辨的广义像素感知深度函数混合网络","authors":"Jiangtao Nie , Lei Zhang , Chongxing Song , Zhiqiang Lang , Weixin Ren , Wei Wei , Chen Ding , Yanning Zhang","doi":"10.1016/j.knosys.2025.113743","DOIUrl":null,"url":null,"abstract":"<div><div>Recent progress on spectral super-resolution (SR) mainly focuses on directly mapping an RGB image to its HSI counterpart using deep convolutional neural networks, <em>i.e.,</em> non-linearly transform the RGB context within a size-fixed receptive field centered at each pixel to its spectrum using a universal deep mapping function. However, in real scenarios, pixels in HSIs inevitably require size-different receptive fields and distinct mapping functions due to their differences in object category or spatial position, and consequently, these existing methods show limited generalization capacity, especially when the imaging scene is complicated. To tackle this issue, we introduce a pixel-aware deep function-mixture network (PADFMN) for SSR, which consists of a novel class of modules called function-mixture (FM) blocks. Each FM block contains several basis functions, represented by parallel subnets with varying receptive field sizes. Additionally, a separate subnet functions as a mixing function, generating pixel-level weights that linearly combine the outputs of the basis functions. This approach allows the network to dynamically adjust the receptive field size and mapping function for each pixel based on its specific characteristics. Through stacking several such FM blocks together and fusing their intermediate feature representations, we can obtain an effective SSR network with flexibility in learning pixel-wise deep mapping functions as well as better generalization capacity. Moreover, with the aim of employing the proposed PADFMN to cope with two more challenging SSR tasks, including cross-sensor SSR (<em>i.e.,</em> test on RGB image shot by a new sensor with unseen spectral response function) and scale-arbitrary SSR (<em>i.e.,</em> the spectral resolution of HSI to reconstruct can be arbitrarily determined), we extend the core FM blocks to two more generalized versions, namely sensor-guided FM block and scale-guided FM block. The former is able to cast the sensor-related information (<em>e.g.,</em> spectral response function) into guidance via dynamic filters to assist the spectral reconstruction using the basic FM block. This is beneficial for reducing the distribution shift between the training and test images incurred by unseen RGB sensors in terms of establishing the deep mapping function, thus leading to pleasing performance in cross-sensor SSR tasks. On the other hand, the latter encodes the user-determined spectral resolution to control the channel dimension of the feature output by the last basic FM block precisely via dynamically generating corresponding convolution filters, so that the network can reconstruct HSI with an arbitrarily determined scale while keeping the spectrum accuracy. 
We test the proposed method on three benchmark datasets, and it achieves state-of-the-art performance in SSR, cross-sensor SSR, and scale-arbitrary SSR tasks.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"323 ","pages":"Article 113743"},"PeriodicalIF":7.2000,"publicationDate":"2025-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Generalized pixel-aware deep function-mixture network for effective spectral super-resolution\",\"authors\":\"Jiangtao Nie , Lei Zhang , Chongxing Song , Zhiqiang Lang , Weixin Ren , Wei Wei , Chen Ding , Yanning Zhang\",\"doi\":\"10.1016/j.knosys.2025.113743\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Recent progress on spectral super-resolution (SR) mainly focuses on directly mapping an RGB image to its HSI counterpart using deep convolutional neural networks, <em>i.e.,</em> non-linearly transform the RGB context within a size-fixed receptive field centered at each pixel to its spectrum using a universal deep mapping function. However, in real scenarios, pixels in HSIs inevitably require size-different receptive fields and distinct mapping functions due to their differences in object category or spatial position, and consequently, these existing methods show limited generalization capacity, especially when the imaging scene is complicated. To tackle this issue, we introduce a pixel-aware deep function-mixture network (PADFMN) for SSR, which consists of a novel class of modules called function-mixture (FM) blocks. Each FM block contains several basis functions, represented by parallel subnets with varying receptive field sizes. Additionally, a separate subnet functions as a mixing function, generating pixel-level weights that linearly combine the outputs of the basis functions. This approach allows the network to dynamically adjust the receptive field size and mapping function for each pixel based on its specific characteristics. Through stacking several such FM blocks together and fusing their intermediate feature representations, we can obtain an effective SSR network with flexibility in learning pixel-wise deep mapping functions as well as better generalization capacity. Moreover, with the aim of employing the proposed PADFMN to cope with two more challenging SSR tasks, including cross-sensor SSR (<em>i.e.,</em> test on RGB image shot by a new sensor with unseen spectral response function) and scale-arbitrary SSR (<em>i.e.,</em> the spectral resolution of HSI to reconstruct can be arbitrarily determined), we extend the core FM blocks to two more generalized versions, namely sensor-guided FM block and scale-guided FM block. The former is able to cast the sensor-related information (<em>e.g.,</em> spectral response function) into guidance via dynamic filters to assist the spectral reconstruction using the basic FM block. This is beneficial for reducing the distribution shift between the training and test images incurred by unseen RGB sensors in terms of establishing the deep mapping function, thus leading to pleasing performance in cross-sensor SSR tasks. On the other hand, the latter encodes the user-determined spectral resolution to control the channel dimension of the feature output by the last basic FM block precisely via dynamically generating corresponding convolution filters, so that the network can reconstruct HSI with an arbitrarily determined scale while keeping the spectrum accuracy. 
We test the proposed method on three benchmark datasets, and it achieves state-of-the-art performance in SSR, cross-sensor SSR, and scale-arbitrary SSR tasks.</div></div>\",\"PeriodicalId\":49939,\"journal\":{\"name\":\"Knowledge-Based Systems\",\"volume\":\"323 \",\"pages\":\"Article 113743\"},\"PeriodicalIF\":7.2000,\"publicationDate\":\"2025-05-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Knowledge-Based Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0950705125007890\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Knowledge-Based Systems","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0950705125007890","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Generalized pixel-aware deep function-mixture network for effective spectral super-resolution
Recent progress in spectral super-resolution (SSR) mainly focuses on directly mapping an RGB image to its hyperspectral image (HSI) counterpart using deep convolutional neural networks, i.e., non-linearly transforming the RGB context within a fixed-size receptive field centered at each pixel into its spectrum with a universal deep mapping function. However, in real scenarios, pixels in HSIs inevitably require receptive fields of different sizes and distinct mapping functions because they differ in object category or spatial position; consequently, existing methods show limited generalization capacity, especially when the imaging scene is complex. To tackle this issue, we introduce a pixel-aware deep function-mixture network (PADFMN) for SSR, which consists of a novel class of modules called function-mixture (FM) blocks. Each FM block contains several basis functions, represented by parallel subnets with varying receptive field sizes. In addition, a separate subnet acts as a mixing function, generating pixel-level weights that linearly combine the outputs of the basis functions. This design allows the network to dynamically adjust the receptive field size and mapping function for each pixel according to its specific characteristics. By stacking several such FM blocks and fusing their intermediate feature representations, we obtain an effective SSR network that flexibly learns pixel-wise deep mapping functions and generalizes better. Moreover, to apply the proposed PADFMN to two more challenging SSR tasks, namely cross-sensor SSR (i.e., testing on RGB images captured by a new sensor with an unseen spectral response function) and scale-arbitrary SSR (i.e., the spectral resolution of the reconstructed HSI can be determined arbitrarily), we extend the core FM block into two generalized versions: the sensor-guided FM block and the scale-guided FM block. The former casts sensor-related information (e.g., the spectral response function) into guidance via dynamic filters to assist the spectral reconstruction performed by the basic FM block. This helps reduce the distribution shift between training and test images caused by unseen RGB sensors when establishing the deep mapping function, leading to strong performance on cross-sensor SSR. The latter encodes the user-specified spectral resolution and, by dynamically generating the corresponding convolution filters, precisely controls the channel dimension of the features output by the last basic FM block, so that the network can reconstruct an HSI at an arbitrarily chosen scale while preserving spectral accuracy. We evaluate the proposed method on three benchmark datasets, where it achieves state-of-the-art performance on SSR, cross-sensor SSR, and scale-arbitrary SSR tasks.
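The core function-mixture idea can be illustrated with a short sketch. The following is a minimal PyTorch sketch, not the authors' released code: the module name `FMBlock`, the channel width, the kernel sizes, and the subnet depths are illustrative assumptions. Parallel convolutional subnets with different kernel sizes play the role of the basis functions, and a separate mixing subnet predicts pixel-level weights that linearly combine their outputs.

```python
# Minimal sketch of a function-mixture (FM) block, assuming simple one-layer
# conv subnets as basis functions; widths and kernel sizes are illustrative.
import torch
import torch.nn as nn


class FMBlock(nn.Module):
    """Mixes basis subnets with different receptive fields via pixel-wise weights."""

    def __init__(self, channels=64, kernel_sizes=(1, 3, 5, 7)):
        super().__init__()
        # Basis functions: parallel conv subnets, one per receptive-field size.
        self.basis = nn.ModuleList(
            [
                nn.Sequential(
                    nn.Conv2d(channels, channels, k, padding=k // 2),
                    nn.ReLU(inplace=True),
                )
                for k in kernel_sizes
            ]
        )
        # Mixing function: predicts one weight per basis function per pixel.
        self.mixer = nn.Sequential(
            nn.Conv2d(channels, len(kernel_sizes), 3, padding=1),
            nn.Softmax(dim=1),
        )

    def forward(self, x):
        weights = self.mixer(x)                                  # (B, K, H, W)
        outs = torch.stack([f(x) for f in self.basis], dim=1)    # (B, K, C, H, W)
        # Linearly combine the basis outputs using the per-pixel weights.
        return (weights.unsqueeze(2) * outs).sum(dim=1)          # (B, C, H, W)


if __name__ == "__main__":
    block = FMBlock(channels=64)
    y = block(torch.randn(1, 64, 32, 32))
    print(y.shape)  # torch.Size([1, 64, 32, 32])
```

Stacking several such blocks and fusing their intermediate feature representations would then give a network of the kind the abstract describes, with the receptive field and mapping function effectively selected per pixel by the learned mixing weights.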
Journal introduction:
Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on systems based on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computation techniques, provide a balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.