{"title":"Evaluating the substitutability of generative AI-generated faces in biometric applications: From a lens of age, gender, ethnicity detection","authors":"Tanusree Ghosh , Baisnabi Seth , Subhashis Kar , Ruchira Naskar","doi":"10.1016/j.patrec.2025.08.013","DOIUrl":"10.1016/j.patrec.2025.08.013","url":null,"abstract":"<div><div>Despite widespread use in sensitive applications, biometric technologies often underperform for underrepresented population groups. Although creating balanced datasets to train biometric models can help, relying on real face images raises ethical, legal, and logistical concerns. To address these challenges, we utilize LLM-based image generation to produce a hyper-realistic synthetic face dataset that covers diverse ages, genders, and ethnicities, created using seven state-of-the-art (SOTA) diffusion-based models. We evaluate twelve SOTA biometric attribute classifiers on both conventional real-face datasets and our synthetic dataset. Our results reveal that SOTA biometric models trained on synthetic faces perform close to, and in many cases better than, the same models trained on real images. These findings indicate that synthetic datasets hold promise as viable substitutes for real data. Our findings also highlight an equally important need to improve the cross-(real)-dataset generalization capabilities of current biometric detectors. 
We conclude by proposing directions for improved ‘substitutability’, including enhanced model architectures and refined synthetic image generation to boost overall generalization.</div></div>","PeriodicalId":54638,"journal":{"name":"Pattern Recognition Letters","volume":"197 ","pages":"Pages 257-266"},"PeriodicalIF":3.3,"publicationDate":"2025-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144904429","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An optimized ridge regression for forecasting time series with a fixed period","authors":"Chuang Han , Yusen Zhou , Jiajia Sun , Zuhe Li","doi":"10.1016/j.patrec.2025.08.017","DOIUrl":"10.1016/j.patrec.2025.08.017","url":null,"abstract":"<div><div>Ridge regression, a classic machine learning algorithm, is widely used in various time series forecasting tasks due to its simplicity and rational modeling mechanism. However, its inherent structural simplicity limits its ability to effectively model time series with a fixed period. To improve this situation, the concepts of the cross-period accumulation and self-estimated seasonal indices are introduced. Specifically, the existing efficient accumulation operations are utilized to perform cross-period accumulation on the time series to improve the modeling diversity. Following this, seasonal indices are embedded into the model structure and treated as parameters to be estimated directly from the data. This approach equips the model with the ability to identify and capture periodicity autonomously, without relying on manually predefined indices. To further improve the model’s generalization ability and forecasting accuracy, two optimization methods are employed: the Whale Optimization Algorithm (WOA) and grid search. The combination of these methods ensures robust hyperparameter tuning and promotes model stability across different forecasting scenarios. Experimental results demonstrate that the proposed model outperforms competing models in terms of MAPE, MSE, and MAE on the monthly Total Renewable Energy Consumption (TREC) dataset from the United States and the quarterly GDP dataset from China. 
Notably, the proposed model achieves MAPE values of 0.7443% and 1.4290% on these datasets, respectively, validating its effectiveness.</div></div>","PeriodicalId":54638,"journal":{"name":"Pattern Recognition Letters","volume":"197 ","pages":"Pages 274-281"},"PeriodicalIF":3.3,"publicationDate":"2025-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144913483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Zero-shot KWS for children’s speech using layer-wise features from SSL models","authors":"Subham Kutum , Abhijit Sinha , Hemant Kumar Kathania , Sudarsana Reddy Kadiri , Mahesh Chandra Govil","doi":"10.1016/j.patrec.2025.08.010","DOIUrl":"10.1016/j.patrec.2025.08.010","url":null,"abstract":"<div><div>Numerous methods have been proposed to enhance Keyword Spotting (KWS) in adult speech, but children’s speech presents unique challenges for KWS systems due to its distinct acoustic and linguistic characteristics. This paper introduces a zero-shot KWS approach that leverages state-of-the-art self-supervised learning (SSL) models, including Wav2Vec2, HuBERT and Data2Vec. Features are extracted layer-wise from these SSL models and used to train a Kaldi-based DNN KWS system. The WSJCAM0 adult speech dataset was used for training, while the PFSTAR children’s speech dataset was used for testing, demonstrating the zero-shot capability of our method. Our approach achieved state-of-the-art results across all keyword sets for children’s speech. Notably, the Wav2Vec2 model, particularly layer 22, performed the best, delivering an ATWV score of 0.691, an MTWV score of 0.7003, and a probability of false alarm (<span><math><msub><mrow><mi>P</mi></mrow><mrow><mtext>fa</mtext></mrow></msub></math></span>) and probability of miss (<span><math><msub><mrow><mi>P</mi></mrow><mrow><mtext>miss</mtext></mrow></msub></math></span>) of 0.0164 and 0.0547, respectively, for a set of 30 keywords. Furthermore, age-specific performance evaluation confirmed the system’s effectiveness across different age groups of children. To assess the system’s robustness against noise, additional experiments were conducted using the best-performing layer of the best-performing Wav2Vec2 model. The results demonstrated a significant improvement over the traditional MFCC-based baseline, emphasizing the potential of SSL embeddings even in noisy conditions. 
To further generalize the KWS framework, the experiments were repeated for an additional CMU dataset. Statistical analyses including paired t-tests and Wilcoxon signed-rank tests were also performed, confirming that the observed improvements are statistically significant and further validating the reliability of the proposed framework. Overall the results highlight the significant contribution of SSL features in enhancing Zero-Shot KWS performance for children’s speech, effectively addressing the challenges associated with the distinct characteristics of child speakers.</div></div>","PeriodicalId":54638,"journal":{"name":"Pattern Recognition Letters","volume":"197 ","pages":"Pages 304-311"},"PeriodicalIF":3.3,"publicationDate":"2025-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144917134","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HSE-GNN: A hierarchical skeleton embedded graph neural network for 3D human pose estimation","authors":"Jing Dong , Chenchen Wang , Hui Fang , Rui Liu , Wei Wu , Pengfei Yi , Ping Xu , Yu Sui","doi":"10.1016/j.patrec.2025.08.014","DOIUrl":"10.1016/j.patrec.2025.08.014","url":null,"abstract":"<div><div>3D human pose estimation (3D-HPE) from monocular images has attracted significant attention due to its various applications in human action recognition and human–computer interaction. Recent graph neural networks (GNNs) exploit skeleton information from neighboring joints to enhance feature representations, thus achieving impressive performance. However, it is difficult to define contextual joints, since joints at longer distances may contain correlated features, while engaging irrelevant joints at closer distances leads to overfitting, thus compromising model generalisation. In this paper, we propose a novel graph neural network, named HSE-GNN, to leverage more correlated joints for accurate 3D-HPE. Building on the high-order local connection network, in which relations between key joint nodes are more reliable at a low-resolution level while joint nodes on the same limbs are more correlated at a finer level, we embed multi-scale skeleton joint representations into our GNN feature extractor to explore contextual joint information hierarchically. This design ensures that more correlated joints are explored in our model to maximise the performance gain from the GNN. Furthermore, it implicitly enforces an extra geometric constraint for better accuracy. Extensive experimental results demonstrate that our method achieves state-of-the-art performance with an average error of 42.2 mm on the H36M dataset. 
Additionally, our method exhibits its strong generalisability when tested on an unseen dataset, i.e., the 3DHP dataset.</div></div>","PeriodicalId":54638,"journal":{"name":"Pattern Recognition Letters","volume":"197 ","pages":"Pages 243-249"},"PeriodicalIF":3.3,"publicationDate":"2025-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144896474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dynamic density adaptive clustering and blockchain integration for efficient federated learning","authors":"Jinlong Wang , Xingtao Yang , Haoran Zhao , Yunting Wu , Xin Sun","doi":"10.1016/j.patrec.2025.08.009","DOIUrl":"10.1016/j.patrec.2025.08.009","url":null,"abstract":"<div><div>As an emerging distributed machine learning technology, Federated Learning (FL) allows multiple participants to collaboratively learn a shared model without exchanging their data, which alleviates privacy and security concerns. However, due to device heterogeneity and non-independent and identically distributed (non-IID) data, existing synchronous and asynchronous FL strategies face low model training efficiency and convergence issues. To address these issues, we propose a Dynamic Density Adaptive Clustering based Blockchain Integration approach for Efficient Federated Learning. Specifically, we introduce a Dynamic Density Adaptive Clustering (DDAC) algorithm for the adaptive grouping of clients, which enhances the rationality of groupings and reduces the differences among client models within groups. Then, we propose the Decentralized Model Optimization and Aggregation (DMOA) strategy to reduce waiting times and model obsolescence while ensuring model consistency and convergence. Finally, we provide the Dual-Norm Robust Convergence Optimization (DRCO) algorithm to address the issue of model update discrepancies caused by asynchronous updates between client groups in DMOA. In this manner, we provide a robust and efficient communication solution in heterogeneous and non-independent and identically distributed (non-IID) FL environments. 
Experimental results demonstrate that this framework outperforms several benchmark algorithms in terms of accuracy and efficiency on the CIFAR-10 and MNIST datasets.</div></div>","PeriodicalId":54638,"journal":{"name":"Pattern Recognition Letters","volume":"197 ","pages":"Pages 267-273"},"PeriodicalIF":3.3,"publicationDate":"2025-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144904430","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CSSNet: A 3D medical image segmentation network based on compressed sparse dual-branch structure","authors":"Xiao Luan , Yule Fu , Linghui Liu , Weisheng Li","doi":"10.1016/j.patrec.2025.08.006","DOIUrl":"10.1016/j.patrec.2025.08.006","url":null,"abstract":"<div><div>Medical image segmentation plays a crucial role in disease assessment, but background noise and low contrast between different tissues pose significant challenges to accurate segmentation. Existing methods either use CNNs to extract local features or use Transformers to capture global context information of medical images, and they overlook how to exploit the pixel differences among different tissues for image segmentation. In this paper, we propose a dual-branch method named Compressed Sparse Segmentation Network (CSSNet) to reduce background interference while enhancing the differentiation between different categories. For the bottleneck coding branch, we use an information bottleneck-based compression coding method in the bottleneck layer to remove irrelevant information while preserving information relevant to the target label. For the global attention sparsification branch, we introduce the global attention mechanism at the bottleneck layer to capture long-range dependencies. Then, we sparsify the features to enhance the differences among different categories. Furthermore, we propose a Multi-scale Attention Fusion (MAF) module to combine the encoder features of the two branches and integrate them into the decoder of the global attention sparsification branch. We conduct experiments on the MRI datasets of BraTS 2021 and iSeg 2019, and the CT dataset of MSD spleen. The experimental results on these three public datasets demonstrate the efficacy of the proposed CSSNet over current popular methods. 
In particular, our model achieves the best mean Dice score and HD95 score on the BraTS 2021 test set.</div></div>","PeriodicalId":54638,"journal":{"name":"Pattern Recognition Letters","volume":"197 ","pages":"Pages 222-228"},"PeriodicalIF":3.3,"publicationDate":"2025-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144896471","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GlueHardener: Correspondence enhancement with geometric constraints for visual odometry","authors":"Jiacheng Zhao , Meng Zhao , Yao Zhang , Xinpeng Zhang , Bochen Ma","doi":"10.1016/j.patrec.2025.08.015","DOIUrl":"10.1016/j.patrec.2025.08.015","url":null,"abstract":"<div><div>In visual odometry, the challenge of accurate pose estimation is compounded by moving objects that disrupt feature matching. Sparse matchers like LightGlue and SuperGlue, while reliable, struggle with dynamic elements, leading to errors in solvers such as RANSAC. To address this issue, we introduce GlueHardener, a geometrically constrained inlier refiner designed to distinguish between inliers and outliers within correspondences, effectively filtering out outliers associated with dynamic objects that compromise the identification of the correct essential matrix model. Firstly, our model conceptualizes each match point as a node within a graph, aggregating information from diverse neighbor types. Then, a hierarchical learning process is employed to empower the model to discern inliers from outliers: at the local level, it assesses the differences between nodes from both the feature and physical distance dimensions; while globally, it enables each node to assimilate information from high-degree nodes. Experimental results demonstrate that our model performs effectively across different environments in the odometry dataset, successfully filtering out dynamic interference points and enhancing pose accuracy in scenarios with significant vehicular movement. 
Improvement in trajectory accuracy across various matchers indicates our model’s strong generalization capability to different datasets and matchers.</div></div>","PeriodicalId":54638,"journal":{"name":"Pattern Recognition Letters","volume":"197 ","pages":"Pages 229-235"},"PeriodicalIF":3.3,"publicationDate":"2025-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144896472","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scale invariable hybrid-attention Generative Adversarial Network for Single Image Super Resolution","authors":"Khushboo Singla, Rajoo Pandey, Umesh Ghanekar","doi":"10.1016/j.patrec.2025.08.011","DOIUrl":"10.1016/j.patrec.2025.08.011","url":null,"abstract":"<div><div>It is well-known that the high-spectral components of an image cannot be truly recovered during the generation of high-resolution images. To address the issue, this paper proposes a network containing a hybrid attention block supported by Scale-Invariant Feature Transform to generate appropriate features in the super-resolved image. The block is designed to extract the multi-level and multi-attention features from the input image to achieve better contextual information in the output image with fewer parameters. Then, the Scale-Invariant Feature Transform is used as a loss function to preserve scale-invariant features from a low-resolution image, thereby reducing the gap between an output image and the ground truth image. Extensive simulations show that the proposed network outperforms many other existing models in terms of PSNR, SSIM, and LPIPS.</div></div>","PeriodicalId":54638,"journal":{"name":"Pattern Recognition Letters","volume":"197 ","pages":"Pages 236-242"},"PeriodicalIF":3.3,"publicationDate":"2025-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144896473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Progressive feature injection and global–local discrimination-based image generation network for improving rail surface defect inspection","authors":"Dezhou Wang, Shixiang Su, Xiaobo Lu","doi":"10.1016/j.patrec.2025.08.004","DOIUrl":"10.1016/j.patrec.2025.08.004","url":null,"abstract":"<div><div>Rail surface defect inspection is an essential task for the railway system. Although current methods have achieved success to some extent, the inspection accuracy is still limited by the scarcity of defect samples. Expanding the training dataset is an intuitive way to solve this problem. However, existing image generation methods struggle to generate samples that combine high quality with rich diversity. This paper proposes a two-step rail surface defect generation method, including background feature learning and defect foreground feature learning. Specifically, we first pretrain a background generation network by utilizing the abundant defect-free samples. Afterwards, we design a progressive defect feature injection branch to learn defect features within generated diverse mask regions and inject them into defect-free features. Meanwhile, a joint discriminator with global and local branches is developed to make the network focus on both the entire image and local irregular defect regions, and a lightweight match discriminator is introduced to ensure the rationality of defect masks. The experiments show that defect images generated by our method have better quality (achieving the lowest FID of 63.91 for scratch and 58.39 for crack) and richer diversity (obtaining the highest IS of 2.2803 for scratch and 2.5525 for crack). 
Additionally, our method achieves the best performance with 0.7019 and 0.7375 mIoU values for scratch and crack defects during the downstream inspection task.</div></div>","PeriodicalId":54638,"journal":{"name":"Pattern Recognition Letters","volume":"197 ","pages":"Pages 250-256"},"PeriodicalIF":3.3,"publicationDate":"2025-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144896475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Index for assessment of stationarity in spatiotemporal climatic signals","authors":"Rahul Dasharath Gavas , Kriti Kumar , Achanna Anil Kumar , Soumya Kanti Ghosh , Arpan Pal","doi":"10.1016/j.patrec.2025.07.025","DOIUrl":"10.1016/j.patrec.2025.07.025","url":null,"abstract":"<div><div>Recent advances in instrumentation and measurement technologies have significantly enhanced large-scale sensor signal acquisition and analysis. A crucial aspect of sensor signal processing is the characterization of these signals, with the assessment of stationarity being particularly important. Determining stationarity in time series signals presents considerable challenges, especially when the signals include a spatial component, resulting in complex spatiotemporal data. In this study, we introduce a method for quantitatively assessing non-stationarity in spatiotemporal climatic signals by modeling variations across spatial and temporal dimensions using the graph total variation technique. The results of simulation, along with real-world signal analysis, indicate that the proposed stationarity index effectively and reliably characterizes the stationarity of spatiotemporal climatic signals.</div></div>","PeriodicalId":54638,"journal":{"name":"Pattern Recognition Letters","volume":"197 ","pages":"Pages 215-221"},"PeriodicalIF":3.3,"publicationDate":"2025-08-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144863757","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}