{"title":"ClGanNet: A novel method for maize leaf disease identification using ClGan and deep CNN","authors":"Vivek Sharma , Ashish Kumar Tripathi , Purva Daga , Nidhi M. , Himanshu Mittal","doi":"10.1016/j.image.2023.117074","DOIUrl":"https://doi.org/10.1016/j.image.2023.117074","url":null,"abstract":"<div><p>With the advancement of technologies, automatic plant leaf disease detection has received considerable attention from researchers working in the area of precision agriculture. A number of deep learning-based methods have been introduced in the literature for automated plant disease detection. However, the majority of datasets collected from real fields have blurred background information, data imbalances, limited generalization, and tiny lesion features, which may lead to over-fitting of the model. Moreover, the increased parameter size of deep learning models is also a concern, especially for agricultural applications with limited resources. In this paper, a novel ClGan (Crop Leaf Gan) with an improved loss function has been developed with a reduced number of parameters as compared to the existing state-of-the-art methods. The generator and discriminator of the developed ClGan are built around an encoder–decoder network to avoid the vanishing gradient problem, training instability, and non-convergence failure while preserving complex intricacies during synthetic image generation with significant lesion differentiation. The proposed improved loss function introduces a dynamic correction factor that stabilizes learning while maintaining effective weight optimization. In addition, a novel plant leaf classification method, ClGanNet, has been introduced to classify plant diseases efficiently. 
The efficiency of the proposed ClGan was validated on the maize leaf dataset in terms of the number of parameters and FID score, and the results were compared against five other state-of-the-art GAN models, namely DC-GAN, W-GAN, WGAN-GP, InfoGan, and LeafGan. Moreover, the performance of the proposed classifier, ClGanNet, was evaluated with seven state-of-the-art methods against eight parameters on the original, basic augmented, and ClGan augmented datasets. In the experiments, ClGanNet outperformed all the considered methods with 99.97% training and 99.04% testing accuracy while using the least number of parameters.</p></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"120 ","pages":"Article 117074"},"PeriodicalIF":3.5,"publicationDate":"2023-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91987222","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
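Editor's note: the ClGan record above reports FID scores against other GANs. As a reader aid, here is a minimal sketch of what FID computes — the Fréchet distance between Gaussian fits of real and generated feature sets. Real FID uses Inception features and full covariance matrices (with a matrix square root); the diagonal-covariance shortcut below is our simplification for illustration, not the paper's code.

```python
import numpy as np

def fid_diagonal(feats_real, feats_fake):
    """Frechet distance between Gaussian fits of two feature sets,
    simplified to diagonal covariances for illustration."""
    mu1, mu2 = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    var1, var2 = feats_real.var(axis=0), feats_fake.var(axis=0)
    mean_term = np.sum((mu1 - mu2) ** 2)
    # With diagonal covariances the trace term reduces elementwise.
    cov_term = np.sum(var1 + var2 - 2.0 * np.sqrt(var1 * var2))
    return float(mean_term + cov_term)
```

Identical feature sets score 0; shifting every feature by a constant adds the squared shift per dimension, so lower is better.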
{"title":"Image tone mapping based on clustering and human visual system models","authors":"Xueyu Han , Ishtiaq Rasool Khan , Susanto Rahardja","doi":"10.1016/j.image.2023.117075","DOIUrl":"10.1016/j.image.2023.117075","url":null,"abstract":"<div><p><span><span>Natural scenes generally have a very high dynamic range (HDR) which cannot be captured in standard dynamic range (SDR) images. HDR imaging techniques can be used to capture these details in both dark and bright regions, and the resultant HDR images can be tone mapped to reproduce them on SDR displays. To adapt to different applications, a tone mapping operator (TMO) should be able to achieve high performance for diverse HDR scenes. In this paper, we present a clustering-based TMO by embedding </span>human visual system models that function effectively in different scenes. A hierarchical scheme is applied for clustering to reduce the </span>computational complexity<span>. We also propose a detail preservation method by superimposing the details of the original HDR images to enhance local contrasts, and a color preservation method by limiting the adaptive saturation parameter to control color saturation attenuation. The effectiveness of our method is assessed by comparing it with state-of-the-art TMOs quantitatively on large-scale HDR datasets and qualitatively with a group of subjects. 
Experimental results of both objective and subjective evaluations show that the proposed method achieves improvements over the competing methods in generating high-quality tone-mapped images with good contrast and natural color appearance for diverse HDR scenes.</span></p></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"120 ","pages":"Article 117075"},"PeriodicalIF":3.5,"publicationDate":"2023-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"136093478","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
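Editor's note: the tone-mapping record above builds on classic HVS-inspired operators. For orientation, a sketch of the global Reinhard baseline — not the paper's clustering-based TMO — which scales luminance by its log-average and compresses with L/(1+L); the `key` default of 0.18 is the conventional mid-grey choice.

```python
import numpy as np

def reinhard_tmo(hdr, key=0.18, eps=1e-6):
    """Global Reinhard operator on a luminance array: normalize by
    the log-average luminance, then compress with L/(1+L)."""
    log_avg = np.exp(np.mean(np.log(hdr + eps)))  # log-average luminance
    scaled = key * hdr / log_avg                  # map scene to mid-grey
    return scaled / (1.0 + scaled)                # compress into [0, 1)
```

The output stays in [0, 1) and preserves the brightness ordering of the input, which is what makes it displayable on SDR screens.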
{"title":"Individual tooth segmentation in human teeth images using pseudo edge-region obtained by deep neural networks","authors":"Seongeun Kim, Chang-Ock Lee","doi":"10.1016/j.image.2023.117076","DOIUrl":"https://doi.org/10.1016/j.image.2023.117076","url":null,"abstract":"<div><p><span><span>In human teeth images taken outside the oral cavity with a general optical camera, it is difficult to segment individual teeth due to common obstacles such as weak edges, intensity inhomogeneities, and strong light reflections. In this work, we propose a method for segmenting individual teeth in human teeth images. The key to this method is to obtain a pseudo edge-region using </span>deep neural networks. After an additional step to obtain </span>initial contours<span><span> for each tooth region, each individual tooth is segmented by applying active contour models. We also present a strategy using existing model-based methods for labeling the data required for </span>neural network training.</span></p></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"120 ","pages":"Article 117076"},"PeriodicalIF":3.5,"publicationDate":"2023-10-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91987221","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Are metrics measuring what they should? An evaluation of Image Captioning task metrics","authors":"Othón González-Chávez , Guillermo Ruiz , Daniela Moctezuma , Tania Ramirez-delReal","doi":"10.1016/j.image.2023.117071","DOIUrl":"https://doi.org/10.1016/j.image.2023.117071","url":null,"abstract":"<div><p><span>Image Captioning is a current research task to describe the image content using the objects and their relationships in the scene. Two important research areas converge to tackle this task: artificial vision and natural language processing. In Image Captioning, as in any computational intelligence task, the performance metrics are crucial for knowing how well (or badly) a method performs. In recent years, it has been observed that classical metrics based on </span><span><math><mi>n</mi></math></span>-grams are insufficient to capture the semantics and the critical meaning to describe the content in an image. To measure how well the current and more recent metrics are doing, in this article we present an evaluation of several kinds of Image Captioning metrics and a comparison between them using the well-known MS-COCO and Flickr8k datasets. The metrics were selected from the most used in prior works; they are those based on <span><math><mi>n</mi></math></span>-grams, such as BLEU, SacreBLEU, METEOR, ROUGE-L, CIDEr, SPICE, and those based on embeddings, such as BERTScore and CLIPScore. We designed two scenarios for this: (1) a set of artificially built captions with several qualities and (2) a comparison of some state-of-the-art Image Captioning methods. Interesting findings emerged in trying to answer the questions: Are the current metrics helping to produce high-quality captions? How do actual metrics compare to each other? 
What are the metrics <em>really</em> measuring?</p></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"120 ","pages":"Article 117071"},"PeriodicalIF":3.5,"publicationDate":"2023-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49833433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
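Editor's note: the metrics survey above centres on n-gram measures such as BLEU. Their core ingredient is clipped n-gram precision, sketched below in pure Python; full BLEU additionally geometric-averages several n-gram orders and applies a brevity penalty, which this toy version omits.

```python
from collections import Counter

def ngram_precision(candidate, reference, n=1):
    """Clipped n-gram precision: each candidate n-gram is credited
    at most as many times as it occurs in the reference."""
    cand = candidate.split()
    ref = reference.split()
    cand_ngrams = Counter(tuple(cand[i:i + n]) for i in range(len(cand) - n + 1))
    ref_ngrams = Counter(tuple(ref[i:i + n]) for i in range(len(ref) - n + 1))
    clipped = sum(min(c, ref_ngrams[g]) for g, c in cand_ngrams.items())
    total = max(sum(cand_ngrams.values()), 1)
    return clipped / total
```

The clipping is what stops a degenerate caption like "the the the" from scoring well against a reference containing "the" once — exactly the kind of failure mode the article probes.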
{"title":"A transformer-based network for perceptual contrastive underwater image enhancement","authors":"Na Cheng, Zhixuan Sun, Xuanbing Zhu, Hongyu Wang","doi":"10.1016/j.image.2023.117032","DOIUrl":"https://doi.org/10.1016/j.image.2023.117032","url":null,"abstract":"<div><p>Vision-based underwater image enhancement methods have received much attention for application in the fields of marine engineering and marine science. The absorption and scattering of light in real underwater scenes leads to severe information degradation in the acquired underwater images, thus limiting further development of underwater tasks. To solve these problems, a novel transformer-based perceptual contrastive network for underwater image enhancement (TPC-UIE) is proposed to achieve visually friendly and high-quality images, where contrastive learning<span> is applied to the underwater image enhancement (UIE) task for the first time. Specifically, to address the limitations of the pure convolution-based network, we embed the transformer into the UIE network to improve its ability to capture global dependencies. The limits of the transformer are then taken into account as convolution is reintroduced to better capture local attention. At the same time, the dual-attention module strengthens the network’s focus on the spatial and color channels that are more severely attenuated. Finally, a perceptual contrastive regularization method is proposed, where a multi-loss function made up of reconstruction loss, perceptual loss, and contrastive loss jointly optimizes the model to simultaneously ensure texture detail, contrast, and color consistency. Experimental results on several existing datasets show that the TPC-UIE obtains excellent performance in both subjective and objective evaluations compared to other methods. 
In addition, the method significantly improves the visual quality of underwater images, which effectively facilitates further development of downstream underwater tasks.</span></p></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"118 ","pages":"Article 117032"},"PeriodicalIF":3.5,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49896211","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
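Editor's note: the TPC-UIE record above describes a multi-term objective of reconstruction, perceptual, and contrastive losses. A shape-only sketch of such an objective follows; the weights, the L1/L2 choices, and the contrastive ratio form are illustrative assumptions of ours, not the paper's exact losses.

```python
import numpy as np

def combined_uie_loss(pred, target, feat, feat_pos, feat_neg,
                      w_rec=1.0, w_per=0.1, w_con=0.1):
    """Illustrative multi-loss: L1 reconstruction, a feature-space
    perceptual distance, and a contrastive ratio that pulls features
    toward the clean reference (positive) and away from the degraded
    input (negative). Weights are arbitrary for the sketch."""
    rec = np.mean(np.abs(pred - target))
    per = np.mean((feat - feat_pos) ** 2)
    eps = 1e-8
    con = np.mean(np.abs(feat - feat_pos)) / (np.mean(np.abs(feat - feat_neg)) + eps)
    return w_rec * rec + w_per * per + w_con * con
```

When the prediction matches the target and its features match the positive anchor, all three terms vanish; any deviation raises the loss.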
{"title":"No-reference blurred image quality assessment method based on structure of structure features","authors":"Jian Chen , Shiyun Li , Li Lin , Jiaze Wan , Zuoyong Li","doi":"10.1016/j.image.2023.117008","DOIUrl":"https://doi.org/10.1016/j.image.2023.117008","url":null,"abstract":"<div><p><span><span><span><span><span><span>The deep structure of an image contains information that is helpful for perceiving its quality. Inspired by the deep-level image features extracted via </span>deep learning<span> methods, we propose a no-reference blurred image quality evaluation model based on the structure of structure features. In the spatial domain, novel weighted local binary patterns are proposed that leverage maximum local variation maps to extract structural features from multi-resolution images. In the </span></span>spectral domain, </span>gradient information<span> of multi-scale Log-Gabor filtered images is extracted as the structure of structure features, and combined with entropy features. Then, the features extracted from both domains are fused to form a quality perception feature vector and mapped into the quality score via support vector regression (SVR). Experiments are conducted to evaluate the performance of the proposed method on various </span></span>IQA databases, including the LIVE, CSIQ, TID2008, TID2013, CID2013, CLIVE, and BID. The experimental results show that compared with some state-of-the-art methods, our proposed method achieves better evaluation results and is more in line with the </span>human visual system<span>. 
The source code will be released at </span></span><span>https://github.com/JamesC0321/s2s_features/</span>.</p></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"118 ","pages":"Article 117008"},"PeriodicalIF":3.5,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49844964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
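Editor's note: the blur-IQA record above builds its spatial features on weighted local binary patterns. The unweighted 3×3 LBP building block can be sketched as follows; the paper's weighting by maximum local variation maps is not reproduced here.

```python
import numpy as np

def lbp_3x3(img):
    """Basic 8-neighbour local binary pattern: each interior pixel
    becomes an 8-bit code recording which neighbours are >= the
    centre pixel."""
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.uint8)
    centre = img[1:h - 1, 1:w - 1]
    for bit, (dy, dx) in enumerate(offsets):
        neigh = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        out |= (neigh >= centre).astype(np.uint8) << bit
    return out
```

A flat region yields the all-ones code 255 (every neighbour ties the centre), while an isolated bright peak yields 0 — the codes summarise local structure, which is why their histograms work as blur-sensitive features.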
{"title":"Magnifying multimodal forgery clues for Deepfake detection","authors":"Xiaolong Liu, Yang Yu, Xiaolong Li, Yao Zhao","doi":"10.1016/j.image.2023.117010","DOIUrl":"https://doi.org/10.1016/j.image.2023.117010","url":null,"abstract":"<div><p><span>Advancements in computer vision<span><span> and deep learning have made generated Deepfake media difficult to distinguish. In addition, recent forgery techniques also modify the audio information based on the forged video, which brings new challenges. However, due to the cross-modal bias, recent multimodal detection methods do not fully explore the intra-modal and cross-modal forgery clues, which leads to limited detection performance. In this paper, we propose a novel audio-visual aware multimodal Deepfake detection framework to magnify intra-modal and cross-modal forgery clues. Firstly, to capture temporal intra-modal defects, a Forgery Clues Magnification Transformer (FCMT) module is proposed to magnify forgery clues based on sequence-level relationships. Then, the Distribution Difference based Inconsistency Computing (DDIC) module based on Jensen–Shannon divergence is designed to adaptively align </span>multimodal information for further magnifying the cross-modal inconsistency. Next, we further explore spatial artifacts by connecting multi-scale feature representations to provide comprehensive information. Finally, a </span></span>feature fusion<span> module is designed to adaptively fuse features to generate a more discriminative feature. 
Experiments demonstrate that the proposed framework outperforms independently trained models, and at the same time, yields superior generalization capability on unseen types of Deepfake.</span></p></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"118 ","pages":"Article 117010"},"PeriodicalIF":3.5,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49881552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
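Editor's note: the DDIC module above scores audio-visual inconsistency with Jensen–Shannon divergence. A minimal implementation over discrete distributions, for reference:

```python
import numpy as np

def js_divergence(p, q, eps=1e-12):
    """Jensen-Shannon divergence between two discrete distributions:
    the average KL divergence of each distribution to their mixture.
    Symmetric and bounded by log(2) (natural log base)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p, q = p / p.sum(), q / q.sum()
    m = 0.5 * (p + q)
    kl = lambda a, b: np.sum(a * np.log((a + eps) / (b + eps)))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

Identical distributions score 0 and disjoint ones score log 2, so the value is a natural "inconsistency" signal between two modality feature distributions.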
{"title":"Multi-scale graph neural network for global stereo matching","authors":"Xiaofeng Wang , Jun Yu , Zhiheng Sun , Jiameng Sun , Yingying Su","doi":"10.1016/j.image.2023.117026","DOIUrl":"https://doi.org/10.1016/j.image.2023.117026","url":null,"abstract":"<div><p>Currently, deep learning-based stereo matching<span><span> is solely based on local convolution networks, which lack enough global information for accurate disparity estimation. Motivated by the excellent global representation capability of graphs, a novel Multi-scale </span>Graph Neural Network<span><span> (MGNN) is proposed to essentially improve stereo matching from a global perspective. Firstly, we construct the multi-scale graph structure, where the multi-scale nodes with projected multi-scale image features<span> can be directly linked by the inner-scale and cross-scale edges, instead of solely relying on local convolutions for deep learning-based stereo matching. To enhance the spatial position information in the non-Euclidean multi-scale graph space, we further propose a multi-scale </span></span>position embedding to embed the potential position features of Euclidean space into projected multi-scale image features. Secondly, we propose the multi-scale graph feature inference to extract global context information on the multi-scale graph structure. Thus, the features can not only be globally inferred on each scale, but also be interactively inferred across different scales to comprehensively consider global context information with multi-scale receptive fields. 
Finally, MGNN is deployed into dense stereo matching and experiments demonstrate that our method achieves state-of-the-art performance on Scene Flow, KITTI 2012/2015, and Middlebury Stereo Evaluation v.3/2021.</span></span></p></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"118 ","pages":"Article 117026"},"PeriodicalIF":3.5,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49844965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
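Editor's note: the MGNN record above stacks feature inference over inner-scale and cross-scale edges. The basic operation underneath any such GNN is neighbourhood aggregation; a toy mean-aggregation step (ignoring MGNN's learned weights and multi-scale structure) looks like this:

```python
import numpy as np

def mean_aggregate(node_feats, edges):
    """One step of mean-aggregation message passing: each node with
    outgoing edges (i, j) replaces its feature by the mean of its
    neighbours' features; isolated nodes keep theirs."""
    n = node_feats.shape[0]
    out = node_feats.copy()
    for i in range(n):
        neigh = [j for (a, j) in edges if a == i]
        if neigh:
            out[i] = node_feats[neigh].mean(axis=0)
    return out
```

Stacking such steps lets information propagate beyond a local window, which is the "global context" argument the abstract makes against purely convolutional cost aggregation.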
{"title":"Enhancing transferability of adversarial examples with pixel-level scale variation","authors":"Zhongshu Mao , Yiqin Lu , Zhe Cheng , Xiong Shen","doi":"10.1016/j.image.2023.117020","DOIUrl":"https://doi.org/10.1016/j.image.2023.117020","url":null,"abstract":"<div><p>The transferability of adversarial examples under the black-box attack setting has attracted extensive attention from the community. Input transformation is one of the most effective approaches to improve the transferability among all methods proposed recently. However, existing methods either only slightly improve transferability or are not robust to defense models. We delve into the generation process of adversarial examples and find that existing input transformation methods tend to craft adversarial examples by transforming the entire image, which we term image-level transformations. This naturally motivates us to perform pixel-level transformations, i.e., transforming only part of the pixels of the image. Experimental results show that pixel-level transformations can considerably enhance the transferability of the adversarial examples while still being robust to defense models. We believe that pixel-level transformations are more fine-grained than image-level transformations, and thus can achieve better performance. Based on this finding, we propose the pixel-level scale variation (PSV) method to further improve the transferability of adversarial examples. The proposed PSV randomly samples a set of scaled mask matrices and transforms part of the pixels of the input image with these matrices to increase the pixel-level diversity. Empirical evaluations on the standard ImageNet dataset demonstrate the effectiveness and superior performance of the proposed PSV on both normally trained models (with the highest average attack success rate of 79.2%) and defense models (with the highest average attack success rate of 61.4%). 
Our method can further improve transferability (with the highest average attack success rate of 88.2%) by combining it with other input transformation methods.</p></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"118 ","pages":"Article 117020"},"PeriodicalIF":3.5,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49844961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
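Editor's note: the PSV record above scales only part of an image's pixels via random mask matrices. A toy sketch of one such pixel-level transformation follows; the mask probability and single scale factor are our illustrative choices — the paper samples a whole set of scaled masks per attack iteration.

```python
import numpy as np

def pixel_level_scale(image, rng, scale=0.5, keep_prob=0.5):
    """Apply a scale factor to a random subset of pixels (chosen by a
    binary mask); the remaining pixels pass through unchanged."""
    mask = (rng.random(image.shape) < keep_prob).astype(image.dtype)
    return image * (1 - mask) + (image * scale) * mask
```

Averaging gradients over many such randomly masked copies is the kind of input diversity the abstract credits for better black-box transferability.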
{"title":"A coarse-to-fine multi-scale feature hybrid low-dose CT denoising network","authors":"Zefang Han , Hong Shangguan, Xiong Zhang, Xueying Cui, Yue Wang","doi":"10.1016/j.image.2023.117009","DOIUrl":"https://doi.org/10.1016/j.image.2023.117009","url":null,"abstract":"<div><p><span><span>With the growing development and wide clinical application of CT technology, the potential radiation damage to patients has sparked public concern. However, reducing the radiation dose may cause large amounts of noise and artifacts in the reconstructed images, which may affect the accuracy of the clinical diagnosis. Therefore, improving the quality of low-dose CT scans has become a popular research topic. Generative adversarial networks (GANs) have provided new research ideas for low-dose CT (LDCT) denoising. However, utilizing only image decomposition or adding new functional </span>subnetworks<span> cannot effectively fuse features of the same type at different scales (or features of different types). Thus, most current GAN-based denoising networks often suffer from low feature utilization and increased network complexity. To address these problems, we propose a coarse-to-fine multiscale feature hybrid low-dose CT denoising network (CMFHGAN). The generator consists of a global denoising module, local texture feature enhancement module, and self-calibration </span></span>feature fusion<span> module. The three modules complement each other and guarantee overall denoising performance. In addition, to further improve the denoising performance, we propose a multi-resolution inception discriminator with multiscale feature extraction ability. 
Experiments were performed on the Mayo and Piglet datasets, and the results showed that the proposed method outperformed the state-of-the-art denoising algorithms.</span></p></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"118 ","pages":"Article 117009"},"PeriodicalIF":3.5,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49845014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
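Editor's note: denoising results like those reported on the Mayo and Piglet datasets are conventionally scored with PSNR (alongside SSIM). For reference, the metric in a few lines:

```python
import numpy as np

def psnr(reference, denoised, data_range=1.0):
    """Peak signal-to-noise ratio in dB: 10*log10(peak^2 / MSE).
    Higher is better; identical images give infinity."""
    mse = np.mean((reference - denoised) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)
```

For images in [0, 1], a uniform error of 0.1 gives exactly 20 dB, which is a handy sanity check when wiring up an evaluation loop.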