{"title":"LKDTNet: Large Kernel Deconstruction Three-Dimensional Network for micro-expression recognition","authors":"Zixuan Jie, Jian Wei, Qiankun Feng, Shigang Wang","doi":"10.1016/j.image.2026.117511","DOIUrl":"10.1016/j.image.2026.117511","url":null,"abstract":"<div><div>Micro-expression recognition (MER) is a crucial yet challenging task due to the subtle and transient nature of micro-expressions. Existing methods cannot effectively capture interactions between different facial regions, owing to the conventional kernel mechanisms used in most deep learning frameworks, and often suffer significant information loss from conventional downsampling techniques. To address these issues, we propose the Large Kernel Deconstruction Three-Dimensional Network (LKDTNet). Built upon a 3D Convolutional Neural Network (3D-CNN), LKDTNet integrates specialized modules to enhance feature extraction from micro-expression images effectively and efficiently. LKDTNet introduces an innovative facial deconstruction module that segments the face into four key regions. This design enables dynamic feature extraction, effectively capturing the interrelations between facial areas during micro-expressions. Additionally, a large kernel extraction module is employed to decompose large kernels into a series of smaller, depthwise convolutions. This technique maintains a wide receptive field, which is essential for capturing nuanced expression details. Furthermore, an advanced downsampling module based on Haar wavelet transforms is employed. This allows for multi-scale, lossless feature decomposition, effectively preserving crucial information and thereby improving the model’s ability to accurately recognize subtle micro-expressions. 
The experiments on the SMIC, CASME II, and SAMM datasets, as well as their composite dataset, demonstrate that LKDTNet significantly outperforms state-of-the-art methods, achieving superior accuracy and robustness in micro-expression recognition tasks.</div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"143 ","pages":"Article 117511"},"PeriodicalIF":2.7,"publicationDate":"2026-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146190554","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AMS: Attention Map Seeds for enhancing interactive segmentation","authors":"Qingsong Lv , Jialong Zhu , Yunbo Rao , Yun Gao , Zhanglin Cheng","doi":"10.1016/j.image.2026.117506","DOIUrl":"10.1016/j.image.2026.117506","url":null,"abstract":"<div><div>Interactive segmentation has advanced significantly. A key challenge in these methods is the selection of query seeds, which often rely on previous seeds’ definitions. This can make it difficult to identify segmentation results that deviate from the ground truth, as users focus on regions with clear positive or negative samples. To address this, we propose a new approach using a visual heatmap-based interaction mechanism with Attention Map Seeds (AMS). AMS is generated by computing the difference between the Segment Gradient-weighted Class Activation Mapping (Seg-Grad-CAM) heatmap and the ground truth. Users receive sparse query seeds along with visual explanations in each interaction round, allowing them to observe and distinguish subtle differences between segmentation results and the ground truth across the entire image. We tested our algorithm on four publicly available datasets. Results show that, on average, AMS achieved a 4.1% higher <span><math><mrow><mi>D</mi><mi>i</mi><mi>c</mi><mi>e</mi></mrow></math></span> score per experimental round compared to the state-of-the-art (SOTA) in single-query seed testing. 
In multi-query seed testing, AMS needed 1.28 fewer clicks on average to achieve convergence precision compared to SOTA.</div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"143 ","pages":"Article 117506"},"PeriodicalIF":2.7,"publicationDate":"2026-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146190555","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pixel-wise perception-distortion trade-off for single image super-resolution","authors":"Hai Su , Yanghui Wei , Zhenwen Jian , Songsen Yu","doi":"10.1016/j.image.2026.117504","DOIUrl":"10.1016/j.image.2026.117504","url":null,"abstract":"<div><div>In existing single image super-resolution (SISR) tasks, the balance between minimizing distortion and enhancing perceptual quality is usually achieved by adjusting the weight combination of losses. However, multiple pixels or even the entire image are generally processed as a whole, so most pixels do not obtain their optimal combination of losses. To tackle this problem, we provide a new SISR framework that predicts and applies an appropriate combination of losses for every pixel. The trade-off between pixel-wise perception and distortion (PD) enables us to produce high-quality super-resolution (SR) outcomes. Our SISR framework is divided into two parts. In the first part, we design a predictor that does not require high-resolution (HR) images for inference. This predictor calculates the degree of information loss of each pixel based on its perceived quality, and generates a pixel-wise PD trade-off weight (PWTO) map of the optimal weight combination for each pixel. In the second part, we design a generator that uses PWTO maps as conditional inputs during forward propagation; in the backpropagation during network training, the generator combines PWTO maps to achieve pixel-level weight allocation. The experiments conducted on five benchmarks show that the proposed PWTOSR outperforms state-of-the-art PD trade-off SR methods in PSNR and LPIPS while achieving comparable results in SSIM and LR-PSNR. The PSNR, SSIM, LPIPS, and LR-PSNR results of the proposed PWTOSR on the DIV2K dataset are 27.87, 0.7948, 0.0976, and 50.42, respectively. 
The qualitative visualization results provide additional evidence of the efficacy of our technique.</div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"143 ","pages":"Article 117504"},"PeriodicalIF":2.7,"publicationDate":"2026-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146190556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Image defogging algorithm based on metaheuristic dark channel prior","authors":"Bo Li, Rui Cao, Hongping Hu, Zhenwei Zhang","doi":"10.1016/j.image.2026.117512","DOIUrl":"10.1016/j.image.2026.117512","url":null,"abstract":"<div><div>Sky regions pose a significant challenge for defogging, often causing issues such as halos and color distortion. To address these issues, this paper proposes a novel defogging algorithm that integrates sky segmentation with an improved Red-billed Blue Magpie Optimization (RBMO) algorithm. Firstly, the sky region of the foggy image is obtained using a sky segmentation algorithm. Secondly, principal component analysis is used to extract features of the sky and non-sky regions as weight parameters, which are fused to calculate the atmospheric light. Meanwhile, the improved RBMO algorithm is utilized to identify the ideal fusion weight parameters for calculating the transmission. Finally, an enhancement step is applied to improve the visual quality of the defogged image. Experimental results on both synthetic and real-world datasets demonstrate that our proposed algorithm achieves superior performance in key evaluation metrics. This indicates its effectiveness in accurately segmenting sky regions, suppressing halos and artifacts, enhancing overall defogging quality, and preserving image details.</div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"143 ","pages":"Article 117512"},"PeriodicalIF":2.7,"publicationDate":"2026-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146190560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Shadow vanishing point detection via combined human/shadow adaptive modulation","authors":"Jin Wan , Zhihao Liu , Hui Yin , Zhenyao Wu , Xinyi Wu , Song Wang","doi":"10.1016/j.image.2026.117508","DOIUrl":"10.1016/j.image.2026.117508","url":null,"abstract":"<div><div>A set of parallel lines in 3D space intersect at a common vanishing point when projected to a 2D image. Vanishing point detection plays an important role in 3D computer vision, and previous works usually focus on the parallel lines in man-made structures, such as buildings, railways, and lanes. Under sunlight, the shadow directions of many objects, especially standing persons, are largely parallel on the ground. The detection of human/shadow vanishing points in 2D images can well complement other cues for recovering 3D information, especially in scenes that lack man-made structures. However, this is a very challenging problem since human shadows do not present explicit parallel line features along the shadow directions. In this paper, we propose a new ShadowVPNet for shadow vanishing point detection, which consists of a combined human/shadow detector, a feature extractor, and a vanishing point classifier. Adaptive modulation modules conditioned on the combined human/shadow priors are incorporated into the feature extractor for modulating the extracted features. We construct a new shadow vanishing point image dataset, as well as all the needed ground-truth annotations, for network training and testing. Experimental results demonstrate the effectiveness of the proposed method on shadow vanishing point detection. 
The code and dataset are released at <span><span>https://github.com/hhqweasd/SVPD</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"143 ","pages":"Article 117508"},"PeriodicalIF":2.7,"publicationDate":"2026-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146190557","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Refining data granularity and feature fusion for boundary refinement in instance segmentation","authors":"Yumeng Yan , Mingming Kong , Maochao Zhang , Shunnan Zhao , Chao Zhang","doi":"10.1016/j.image.2026.117490","DOIUrl":"10.1016/j.image.2026.117490","url":null,"abstract":"<div><div>Considerable efforts have been made in the development of current instance segmentation approaches, but the segmentation of mask boundaries remains a challenge. Feature maps with low spatial resolution, along with the small proportion of edge pixels relative to the total pixel count, lead to inaccurate boundaries in instance masks. Furthermore, the parsing of feature maps in high-resolution networks is typically at a low level, making it difficult for the network to learn deeper semantic features. This paper presents improvements to Boundary Patch Refinement (BPR) for instance segmentation to address the above issues. First, we improve the bounding box extraction methods utilized in the data processing, refining the granularity of the data. Second, we introduce a feature fusion approach specifically designed to optimize the feature fusion module within the backbone network. Third, we propose Deep enhancement and Memory optimization (DAM), a module that enhances the network’s ability to learn deeper features, improves its efficiency in acquiring semantic information, and substantially reduces the computational overhead during training. Experimental results demonstrate that our network yields notable improvements in both segmentation accuracy and computational efficiency and outperforms existing methods. 
The code is available at <span><span>https://github.com/njezmjez/RDGFBR</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"143 ","pages":"Article 117490"},"PeriodicalIF":2.7,"publicationDate":"2026-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146001836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing few-shot semantic segmentation in remote sensing through magnitude-based pruning","authors":"Kingsley Amoafo , Godwin Banafo Akrong , Ebenezer Owusu","doi":"10.1016/j.image.2026.117492","DOIUrl":"10.1016/j.image.2026.117492","url":null,"abstract":"<div><div>Few-Shot Semantic Segmentation (FSSS) in remote sensing faces significant challenges owing to the limited availability of labeled data and the complexity of high-resolution imagery. To address these challenges, we propose a novel framework that integrates magnitude-based pruning with Cross-Matching and Self-Matching modules. By systematically pruning 30% of the redundant weights from the backbone network, we enhance feature extraction and segmentation accuracy while maintaining model efficiency. The Cross-Matching Module establishes robust semantic correspondences between the support and query images, whereas the Self-Matching Module refines segmentation through intra-query correlations, incorporating spatial and semantic proximity to improve feature consistency. Experimental evaluations on the DLRSD-5i and ISAID-5i datasets demonstrate the effectiveness of the proposed method. On DLRSD-5i, the pruned SCCNet achieved a mean mIoU improvement of +9.40 (1-shot) and +6.46 (5-shot) compared to the baseline, outperforming state-of-the-art models. Similarly, on ISAID-5i, the pruned ResNet-101 surpassed SOTA by +1.18 (1-shot) and +0.69 (5-shot) in mean mIoU. These results validate the effectiveness of pruning in optimizing the baseline model for FSSS tasks, thereby enhancing its ability to generalize and accurately segment complex remote-sensing imagery. We demonstrated that high-quality pruned feature maps can enhance segmentation accuracy without the need for additional enhancement modules. This approach not only improves segmentation performance but also provides valuable insights into the role of backbone optimization in FSSS. 
Our findings highlight the potential of magnitude-based pruning as a foundational strategy for aligning backbone optimization with the demands of few-shot tasks, thereby offering a scalable solution for remote sensing segmentation tasks.</div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"143 ","pages":"Article 117492"},"PeriodicalIF":2.7,"publicationDate":"2026-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146039662","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inpainting-assisted reversible authentication method for demosaiced image with enhanced recoverability","authors":"Wien Hong, Guan-Zhong Su, Tung-Shou Chen","doi":"10.1016/j.image.2026.117510","DOIUrl":"10.1016/j.image.2026.117510","url":null,"abstract":"<div><div>Modern image capture devices typically incorporate a Color Filter Array (CFA) to separate primary light colors for image capture. After undergoing demosaicing, the resulting image, known as a demosaiced image, comprises sampled and rebuilt components. Recent research has concentrated on detecting tampering in demosaiced images, yet current methods face challenges when dealing with larger tampered areas, resulting in incomplete or rough recovery. This paper introduces a recoverable demosaiced image authentication technique. It extracts recovery codes from the sampled components using the adaptive adjustment technique and embeds them into the rebuilt components through adaptive embedding. Authentication codes are calculated and embedded into the least significant bits of the rebuilt components. After authentication, tampered areas can be restored using the recovery codes, while unrecoverable parts are repaired using image inpainting. If the marked image is untampered, the CFA image can be extracted to restore the original demosaiced image. 
Compared to previous state-of-the-art methods, our approach achieves a noticeable improvement in visual quality when repairing tampered areas that cover over 50% of the image.</div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"143 ","pages":"Article 117510"},"PeriodicalIF":2.7,"publicationDate":"2026-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146190553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Advancing multimodal biometric image retrieval with sparse spectral graph convolution network and banyan tree growth optimization","authors":"D. Binu","doi":"10.1016/j.image.2026.117501","DOIUrl":"10.1016/j.image.2026.117501","url":null,"abstract":"<div><div>Recently, multimodal biometric systems have gained attention for enhancing recognition accuracy and robustness, yet they still face issues such as noise interference, redundant features, low accuracy, and inefficient data integration. To overcome these complications, we propose Advancing Multimodal Biometric Image Retrieval with Sparse Spectral Graph Convolution Network and Banyan Tree Growth Optimization (MBI-SSGCN-BTGO). Here, the input images are collected from the soco-fingerprint-female-and-male, face-recognition, and mmu-iris datasets. The input images are preprocessed using the Fast Guided Median Filter (FGMF) for contrast correction, image scaling, cropping, and normalization. Afterward, the Holistic Dynamic Frequency Transform (HDFT) is used to extract features from the images. Then, Snow Ablation Optimization (SAO) is used to choose the most relevant features. The optimal features are used for image retrieval, aiding in identity verification prior to classification. Classification is done by the Sparse Spectral Graph Convolutional Network (SSGCN), which classifies the biometric categories: woman and man for the face-recognition dataset, female and male for the soco-fingerprint-female-and-male dataset, and left and right for the mmu-iris dataset. Finally, the Banyan Tree Growth Optimization (BTGO) algorithm is employed to optimize the weight parameters of the SSGCN. By integrating BTGO, the model efficiently identifies optimal feature representations, improving convergence speed and overall classification performance. The proposed MBI-SSGCN-BTGO approach is implemented in Python and its performance is evaluated on several metrics. 
The MBI-SSGCN-BTGO technique attains a 16.17%, 17.43%, and 19.23% lower False Acceptance Rate (FAR) and 29.45%, 28.42%, and 29.11% higher precision, and 26.17%, 27.43% when compared with existing techniques, respectively.</div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"143 ","pages":"Article 117501"},"PeriodicalIF":2.7,"publicationDate":"2026-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146190540","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic reversible data hiding for relational database using attribute value ordering","authors":"Jinda Zeng, Bo Ou","doi":"10.1016/j.image.2026.117509","DOIUrl":"10.1016/j.image.2026.117509","url":null,"abstract":"<div><div>As an essential medium for information storage, databases often need to be shared and transferred. At the same time, databases are threatened by privacy leakage and attacks such as piracy, illegal copying, and tampering. Consequently, reversible data hiding (RDH) for databases is designed as a solution to meet the needs of copyright protection and tracking. Many RDH algorithms for databases have been proposed recently, but the trade-off between embedding capacity and embedding distortion is still unsatisfactory. To remedy this, we propose a low-distortion RDH method for relational databases that uses attribute value ordering to fully exploit the intra- and inter-attribute correlations. We sort the tuples according to the order of the numerical attribute values and embed the message bits by modifying the maximum/minimum values. To achieve better embedding performance, the database is split into multiple groups, and each group is dynamically embedded with secret data using attribute value ordering. 
Experimental results show that the proposed method can achieve a large embedding capacity with relatively low distortion compared to the state-of-the-art methods.</div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"143 ","pages":"Article 117509"},"PeriodicalIF":2.7,"publicationDate":"2026-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146190538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}