{"title":"CASK-Net fusion: Multi branch approach for cross-age sketch face recognition","authors":"Ipsita Pattnaik , Amita Dev , A.K. Mohapatra","doi":"10.1016/j.image.2025.117369","DOIUrl":"10.1016/j.image.2025.117369","url":null,"abstract":"<div><div>Cross-Age Sketch Face Recognition targets the collective problem of Cross-Age Face Recognition (FR) and Sketch Face Recognition. Existing works discuss these problems individually, but no attempts towards collective version of these problems have been observed. In real life law enforcement, criminal and forensic investigations; the age and facial appearance of a subject may be different at sketch generation time and recognition time (present day). We therefore address this issue and propose a CASK-Net fusion approach to solve the collective problem of Cross-Age FR and Sketch FR. This paper presents a novel CASK-Net fusion architecture to capture discriminative features using multiple feature extractor branches including HOG, SIFT, CNN, LBP, ORB and Inception ResNetV2 (SOTA) respectively. The proposed approach grounds on extraction of age invariant features from sketch images of an individual for effective recognition. Our approach eliminates the requirement of modality conversion (sketch-photo) for recognition and provides less complex (transformation complexity is eliminated) solution. We also propose a benchmark Cross-Age Sketch (CASK) dataset to serve as a standard towards collective problem of Cross-Age FR and Sketch FR. The quantitative and ablation results highlight 95.52 % AUC-ROC performance and the fusion model achieved 93.37 % training accuracy (last epoch). Moreover, the SOTA comparison and dataset analysis confirms the model superiority with validation accuracy of 60.89 % on challenging and intrinsically hard CASK dataset.</div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"138 ","pages":"Article 117369"},"PeriodicalIF":3.4,"publicationDate":"2025-06-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144271276","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis discriminative convolutional graph dictionary learning for generalized signal classification","authors":"Yuehan Xiong, Xin Li, Wenrui Dai, Hongkai Xiong","doi":"10.1016/j.image.2025.117356","DOIUrl":"10.1016/j.image.2025.117356","url":null,"abstract":"<div><div>Analysis discriminative dictionary learning (ADDL) techniques have been studied for addressing image classification problems. However, existing ADDL methods ignore the structural dependency within the signals and cannot fit the general class of signals with irregular structures, including spherical images and 3D objects. In this paper, we propose a novel analysis discriminative convolutional graph dictionary learning method that fully exploits the structural dependency for signal classification, especially for irregular graph signals. The proposed method integrates the graph embedding information to analysis convolutional dictionary learning to derive a set of class-specific convolutional graph sub-dictionaries for extracting consistent class-specific features. An analytical decorrelation term is introduced as regularization to constrain the linear classifier for each class and improve the discrimination ability of dictionary-based sparse representation. Furthermore, we develop an efficient alternating update algorithm to solve the formulated non-convex minimization problem that simultaneously achieves sparse representation using ISTA and optimizes the convolutional graph dictionary and classifiers in an analytic manner. To our best knowledge, this paper is the first attempt to achieve analysis dictionary learning for generalized classification of signals with regular and irregular structures. Experimental results show that the proposed method outperforms state-of-the-art discriminative dictionary learning methods by 0.26% to 2.68% in classification accuracy for both regular and irregular signal classification. Notably, it is comparable to recent deep learning models with up to about 1% accuracy loss in irregular signal classification.</div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"138 ","pages":"Article 117356"},"PeriodicalIF":3.4,"publicationDate":"2025-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144307657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust emotion recognition in thermal imaging with convolutional neural networks and grey wolf optimization","authors":"Anselme Atchogou , Cengiz Tepe","doi":"10.1016/j.image.2025.117363","DOIUrl":"10.1016/j.image.2025.117363","url":null,"abstract":"<div><div>Facial Expression Recognition (FER) is a pivotal technology in human-computer interaction, with applications spanning psychology, virtual reality, and advanced driver assistance systems. Traditional FER using visible light cameras faces challenges in low light conditions, shadows, and reflections. This study explores thermal imaging as an alternative, leveraging its ability to capture heat radiation and overcome lighting issues. We propose a novel approach that combines pre-trained models, particularly EfficientNet variants, with Grey Wolf Optimization (GWO) and various classifiers for robust emotion recognition. Ten pre-trained CNN models, including variants of EfficientNet (EfficientNet-B0, B3, B4, B7, V2L, V2M, V2S), ResNet50, MobileNet, and InceptionResNetV2, are utilized to extract features from thermal images. GWO is employed to optimize the parameters of four classifiers: Support Vector Machine (SVM), Random Forest, Gradient Boosting, and k-Nearest Neighbors (kNN). Two popular thermal image datasets, IRDatabase and KTFE, are used to assess the suggested methodology. Combining EfficientNet-B7 with GWO and kNN or SVM for eight distinct emotions (fear, anger, contempt, disgust, happiness, neutrality, sadness, and surprise) yielded the highest accuracy of 91.42 % on the IRDatabase dataset. Combining EfficientNet-B7 with GWO and Gradient Boosting for seven distinct emotions (anger, disgust, fear, happiness, neutrality, sadness, and surprise) yielded the highest accuracy of 99.48 % on the KTFE dataset. These results demonstrate the effectiveness and reliability of the proposed approach for emotion identification in thermal images, making it a viable way to overcome the drawbacks of conventional visible-light-based FER systems.</div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"138 ","pages":"Article 117363"},"PeriodicalIF":3.4,"publicationDate":"2025-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144308067","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SJND: A Spherical Just Noticeable Difference Modelling for 360° video coding","authors":"Liqun Lin , Yanting Wang , Jiaqi Liu , Hongan Wei , Bo Chen , Weiling Chen , Tiesong Zhao","doi":"10.1016/j.image.2025.117354","DOIUrl":"10.1016/j.image.2025.117354","url":null,"abstract":"<div><div>The popularity of 360° video is due to its realistic and immersive experience, but the higher resolution poses challenges for data transmission and storage. Existing compression schemes for 360° videos mainly focus on spatial and temporal redundancy elimination, neglecting the removal of visual perception redundancy. To address this issue, we exploit the visual characteristics of 360° equirectangular projection to extend the popular Just Noticeable Difference model to Spherical Just Noticeable Difference. Our modeling takes advantage of the following factors: regional masking factor, which employs an entropy-based region classification and separately characterizes contrast masking effects on different regions; latitude projection characteristics, which model the impact of pixel-level warping during equirectangular projection mapping; field of view attention factor, which reflects the attention variation of the human visual system on 360° display. Subjective tests show that our Spherical Just Noticeable Difference model is consistent with user perceptions and also has a higher tolerance of distortions with reduced bit rates of 360° pictures. Further experiments on Versatile Video Coding also demonstrate that the introduction of the proposed model significantly reduces bit rates with negligible loss in perceived visual quality.</div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"138 ","pages":"Article 117354"},"PeriodicalIF":3.4,"publicationDate":"2025-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144240531","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An improved Siamese network-based tracking method for UAV video with salt-and-pepper noise","authors":"Xin Lu , Yong Wang , Fusheng Li","doi":"10.1016/j.image.2025.117366","DOIUrl":"10.1016/j.image.2025.117366","url":null,"abstract":"<div><div>In practical application of unmanned aerial vehicle (UAV) monitoring system, the video in visual tracking system inevitably introduces salt-and-pepper (s&p) noise because of the interference from external environment. At this time, it could degrade the performance or lead to tracking failure. However, most existing tracking algorithms focus on the improvement of accuracy and robustness while ignore the video quality enhancement. To this end, this paper proposes a Siamese network-based tracker together with a denoising module to reduce the effect of s&p noise. The proposed tracker firstly adds a simpler version of adaptive median layer into feedforward denoising convolutional neural network to form a video quality enhancer to achieve the noise suppression. Then by means of Siamese network, the tracker extracts the features of preprocessed detection frame and template frame, and the region proposal network (RPN) is utilized to generate the foreground candidate box and its position. Thus, the proposed tracker can maintain the stability and effectiveness of tracking under the influences of the s&p noise. In addition, we construct a noisy DTB70 dataset with genetated noise for experimental validation. Experimental results show that the proposed method can track the target effectively at different noise levels. It is worth mentioning that our proposed method reports promising results on s&p noise even with high levels.</div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"138 ","pages":"Article 117366"},"PeriodicalIF":3.4,"publicationDate":"2025-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144279799","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ZERRIN-Net: Adaptive low-light image enhancement using Retinex decomposition and noise extraction","authors":"Wenchao Li , Shuyuan Wen , Jinhao Zhu , Qiaofeng Ou , Yanchun Guo , Jiabao Chen , Bangshu Xiong","doi":"10.1016/j.image.2025.117345","DOIUrl":"10.1016/j.image.2025.117345","url":null,"abstract":"<div><div>Low-light image enhancement aims at correcting the exposure of images taken under underexposed conditions while removing image noise and restoring image details. Most of the previous low-light image enhancement algorithms used hand-made a priori denoising in the corrected component; however, due to the large amount of detail information of the image contained in the corrected component and the presence of some pseudo-noise, the final enhancement results obtained by these solutions do not have the original noise removed, and the image details appear blurred. To solve the above problems, we propose ZERRIN-Net, a zero-shot low-light enhancement method based on Retinex decomposition. First of all, we first design the original noise extraction network N-Net, which can adaptively extract the original noise of low-light images without losing the detailed information of the images. In addition, we propose the decomposition network RI-Net, which is based on the Retinex principle and utilizes a simple self-supervised mechanism to help decompose a low-light image into a reflection component and a light component. In this paper, we conduct extensive experiments on numerous datasets as well as advanced vision tasks such as face detection, target recognition, and instance segmentation. The experimental results show that the performance of our method is competitive with current state-of-the-art methods. The code is available at: <span><span>https://github.com/liwenchao0615/ZERRINNet</span><svg><path></path></svg></span></div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"138 ","pages":"Article 117345"},"PeriodicalIF":3.4,"publicationDate":"2025-05-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144147613","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiple-image encryption algorithm based on S-boxes and DNA sequences","authors":"Muhammad Umair Safdar , Tariq Shah , Asif Ali","doi":"10.1016/j.image.2025.117353","DOIUrl":"10.1016/j.image.2025.117353","url":null,"abstract":"<div><div>Image encryption is crucial for safeguarding sensitive visual data; however, traditional methods often encounter challenges regarding efficiency and adaptability to the unique characteristics of images. This research is motivated by the potential of ring-based algebraic structures to develop lightweight, secure, and efficient encryption schemes specifically designed for image data. The article presents a novel approach to image encryption in cryptography using a local ring algebraic structure. The proposed method involves encrypting multiple images by constructing substitution boxes from subsets, which are not subgroups but have identity and invertibility axioms. The challenge of using subsets for encryption purposes is addressed by taking unit elements of the ring, picking a subgroup, and splitting it into two subsets. The substitution box is generated by one of the subsets and used for the substitution process, while the other subset is mapped to the Galois field. It constructs the substitution box and is used for diffusion. A DNA sequence is applied to the red, green, and blue channels of the image, and a key is generated by hashing the image and using a subset of the subgroup of units of the ring. Finally, all channels are XORed with the key. The performance of the proposed scheme is evaluated using different analyses, and it is found that the scheme outperforms existing approaches. This approach presents a promising solution for image encryption in cryptography.</div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"138 ","pages":"Article 117353"},"PeriodicalIF":3.4,"publicationDate":"2025-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144177675","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Motion-blurred image restoration in haze weather based on the color line model","authors":"Jiamin Li, Hongping Hu, Yanping Bai","doi":"10.1016/j.image.2025.117352","DOIUrl":"10.1016/j.image.2025.117352","url":null,"abstract":"<div><div>Many dehazing algorithms had been developed for hazy images caused by industrial pollution and hazy weather. However, most of the dehazing algorithms only consider the restoration of images in static scenes and ignore the restoration of hazy images in dynamic scenes. Based on this, this paper proposed an image recovery algorithm based on the color line model for dynamic scenes under hazy weather. The algorithm was divided into two parts: To improve image contrast by image dehazing and to improve image clarity by deblurring the dynamic scene. Firstly, the watershed algorithm was used to divide the image into foreground and background; Secondly, the color line model was used to restore the motion blur of the foreground image; At the same time, the color line model was adopted to dehazing the hazy background image; And finally, the foreground image and the background image were fused. The experimental results were shown that compared with other mainstream dehazing and deblurring algorithms, the recovered image of this paper's algorithm performs well in terms of both color assurance and texture details, and has better evaluation indexes.</div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"138 ","pages":"Article 117352"},"PeriodicalIF":3.4,"publicationDate":"2025-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144168661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hyperspectral mixed noise removal using nonconvex low-rank and total generalized variation","authors":"Xinwu Liu","doi":"10.1016/j.image.2025.117344","DOIUrl":"10.1016/j.image.2025.117344","url":null,"abstract":"<div><div>To better preserve the structural features while removing the mixed noise in hyperspectral image (HSI), this paper presents a novel HSI denoising method based on nonconvex low-rank (NLR) and total generalized variation (TGV) minimization. The proposed NLRTGV solver closely incorporates the advantages of TGV regularization, nonconvex nuclear norm, and <span><math><msub><mrow><mi>ℓ</mi></mrow><mrow><mn>1</mn></mrow></msub></math></span>-norm. More specifically, the TGV regularizer, which maintains the spatial structure features, is adopted to eliminate Gaussian noise and prevent the staircase artifacts. The usage of nonconvex penalty is to explore the spectral low-rank properties, which contributes to suppress the sparse noise and preserve the major data components. Besides, we further employ the <span><math><msub><mrow><mi>ℓ</mi></mrow><mrow><mn>1</mn></mrow></msub></math></span>-norm regularization to detect the sparse noise that includes impulse noise, deadlines and stripes. Computationally, by combining the iteratively reweighted <span><math><msub><mrow><mi>ℓ</mi></mrow><mrow><mn>1</mn></mrow></msub></math></span> algorithm, singular value shrinkage method and primal-dual framework, we construct in detail a modified alternating direction method of multipliers to solve the resulting optimization problem. Finally, as evidently demonstrated in both simulated and real-world HSI datasets experiments, our proposed approach shows the outstanding denoising performance in terms of mixed noise removal and detail features preservation, over the existing state-of-the-art competitors.</div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"138 ","pages":"Article 117344"},"PeriodicalIF":3.4,"publicationDate":"2025-05-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144147612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Signal processing for haptic surface modeling: A review","authors":"Antonio Luigi Stefani , Niccolò Bisagno , Andrea Rosani , Nicola Conci , Francesco De Natale","doi":"10.1016/j.image.2025.117338","DOIUrl":"10.1016/j.image.2025.117338","url":null,"abstract":"<div><div>Haptic feedback has been integrated into Virtual and Augmented Reality, complementing acoustic and visual information and contributing to an all-round immersive experience in multiple fields, spanning from the medical domain to entertainment and gaming. Haptic technologies involve complex cross-disciplinary research that encompasses sensing, data representation, interactive rendering, perception, and quality of experience. The standard processing pipeline, consists of (I) sensing physical features in the real world using a transducer, (II) modeling and storing the collected information in some digital format, (III) communicating the information, and finally, (IV) rendering the haptic information through appropriate devices, thus producing a user experience (V) perceptually close to the original physical world. Among these areas, sensing, rendering and perception have been deeply investigated and are the subject of different comprehensive surveys available in the literature. Differently, research dealing with haptic surface modeling and data representation still lacks a comprehensive dissection. In this work, we aim at providing an overview on modeling and representation of haptic surfaces from a signal processing perspective, covering the aspects that lie in between haptic information acquisition on one side and rendering and perception on the other side. We analyze, categorize, and compare research papers that address the haptic surface modeling and data representation, pointing out existing gaps and possible research directions.</div></div>","PeriodicalId":49521,"journal":{"name":"Signal Processing-Image Communication","volume":"138 ","pages":"Article 117338"},"PeriodicalIF":3.4,"publicationDate":"2025-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144131260","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}