{"title":"Enhancing visual adversarial transferability via affine transformation of intermediate-level perturbations","authors":"Qizhang Li , Yiwen Guo , Wangmeng Zuo","doi":"10.1016/j.patrec.2025.03.003","DOIUrl":"10.1016/j.patrec.2025.03.003","url":null,"abstract":"<div><div>The transferability of adversarial examples across deep neural networks (DNNs) provides an effective method for black-box attacks and poses a severe threat to the applications of DNNs. Recent studies show that making the intermediate-level perturbation (the difference between the intermediate representations of adversarial examples and their corresponding benign examples) less adversarial, <em>e.g.</em>, by reducing its magnitude, will improve the alignment of input gradients across substitute and victim models, thereby enhancing the transferability of adversarial examples. In this paper, we introduce an intermediate-level perturbation degradation framework that applies an affine transformation to the intermediate-level perturbation, enabling various degradation methods and thus improving the input gradient alignment. Experimental results show that our method outperforms existing state-of-the-art methods on CIFAR-10, Food-101, Oxford-IIIT Pet, and ImageNet when attacking various victim models. Moreover, it can be combined with existing methods to achieve further improvements. Our code: <span><span>https://github.com/qizhangli/ILPD-plus-plus</span></span>.</div></div>","PeriodicalId":54638,"journal":{"name":"Pattern Recognition Letters","volume":"191 ","pages":"Pages 51-57"},"PeriodicalIF":3.9,"publicationDate":"2025-03-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143620245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Federated Knowledge Recycling: Privacy-preserving synthetic data sharing","authors":"Eugenio Lomurno, Matteo Matteucci","doi":"10.1016/j.patrec.2025.02.030","DOIUrl":"10.1016/j.patrec.2025.02.030","url":null,"abstract":"<div><div>Federated learning has emerged as a paradigm for collaborative learning, enabling the development of robust models without the need to centralise sensitive data. However, conventional federated learning techniques have privacy and security vulnerabilities due to the exposure of models, parameters or updates, which can be exploited as an attack surface. This paper presents Federated Knowledge Recycling (FedKR), a cross-silo federated learning approach that uses locally generated synthetic data to facilitate collaboration between institutions. FedKR combines advanced data generation techniques with a dynamic aggregation process to provide greater security against privacy attacks than existing methods, significantly reducing the attack surface. Experimental results on generic and medical datasets show that FedKR achieves competitive performance, with an average improvement in accuracy of 4.24% compared to training models from local data, demonstrating particular effectiveness in data scarcity scenarios.</div></div>","PeriodicalId":54638,"journal":{"name":"Pattern Recognition Letters","volume":"191 ","pages":"Pages 124-130"},"PeriodicalIF":3.9,"publicationDate":"2025-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143643681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A fast hybrid entropy-attribute diversity sampling based graph kernel","authors":"Abd Errahmane Kiouche , Hamida Seba , Aymen Ourdjini","doi":"10.1016/j.patrec.2025.02.031","DOIUrl":"10.1016/j.patrec.2025.02.031","url":null,"abstract":"<div><div>Graph kernels have become a cornerstone in the analysis of graph-structured data, offering powerful tools for similarity assessment in various domains. However, existing graph kernel methods often grapple with efficiently capturing both the structural complexity and attribute diversity inherent in graphs. This paper introduces the “Hybrid Entropy-Attribute Diversity Sampling Graph Kernel” (HEADS), a novel approach that synergizes entropy-based analysis with attribute-diversity-driven sampling to address these challenges. Our method leverages the von Neumann entropy to quantify the informational content and complexity of graph structures, enhancing the expressiveness of the kernel. Additionally, we introduce an innovative attribute-diversity-driven snowball sampling technique, which ensures a comprehensive and representative selection of graph features. The integration of entropy measures with attribute diversity in our kernel computation marks a significant advancement in graph kernel analysis, paving the way for its application in large-scale, real-world graph data scenarios. This paper details the formulation of the HEADS approach, its algorithmic implementation, and an extensive evaluation demonstrating its efficacy in both computational performance and classification accuracy.</div></div>","PeriodicalId":54638,"journal":{"name":"Pattern Recognition Letters","volume":"191 ","pages":"Pages 89-95"},"PeriodicalIF":3.9,"publicationDate":"2025-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143637170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ObjMST: Object-focused multimodal style transfer","authors":"Chanda Grover Kamra , Indra Deep Mastan , Debayan Gupta","doi":"10.1016/j.patrec.2025.02.033","DOIUrl":"10.1016/j.patrec.2025.02.033","url":null,"abstract":"<div><div>We propose ObjMST, an object-focused multimodal style transfer framework that provides separate style supervision for salient objects and surrounding elements while addressing alignment issues in multimodal representation learning. Existing image-text multimodal style transfer methods face the following challenges: (1) generating non-aligned and inconsistent multimodal style representations; and (2) content mismatch, where identical style patterns are applied to both salient objects and their surrounding elements. Our approach mitigates these issues by: (1) introducing a Style-Specific Masked Directional CLIP Loss, which ensures consistent and aligned style representations for both salient objects and their surroundings; and (2) incorporating a salient-to-key mapping mechanism for stylizing salient objects, followed by image harmonization to seamlessly blend the stylized objects with their environment. We validate the effectiveness of ObjMST through experiments, using both quantitative metrics and qualitative visual evaluations of the stylized outputs. Our code is available at: <span><span>https://github.com/chandagrover/ObjMST</span></span>.</div></div>","PeriodicalId":54638,"journal":{"name":"Pattern Recognition Letters","volume":"191 ","pages":"Pages 66-72"},"PeriodicalIF":3.9,"publicationDate":"2025-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143628556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GazeViT: A gaze-guided hybrid attention vision transformer for cross-view matching of street-to-aerial images","authors":"Yidong Hu, Li Tong, Yuanlong Gao, Ying Zeng, Bin Yan, Zhongrui Li","doi":"10.1016/j.patrec.2025.03.012","DOIUrl":"10.1016/j.patrec.2025.03.012","url":null,"abstract":"<div><div>The goal of cross-view matching between street and aerial images is to retrieve aerial-view images that correspond to a given street-view image from a database of GPS-tagged aerial images. This task relies on image cross-view matching technology, focusing on the extraction and alignment of features representing the same location in both image types. The significant differences in perspective and appearance between street-view images and aerial-view images present a challenge. Aerial-view images cover a broader area, while street-view images focus on specific locations, creating information asymmetry that complicates the matching process. To tackle these challenges, this paper proposes a gaze-guided hybrid attention Vision Transformer, which uses gaze information to guide the model to focus on and align task-related features. Furthermore, inspired by the human visual cognitive process of \"focus and zoom,\" we develop a hybrid attention module alongside an image adaptive cropping and resolution enhancement module. The hybrid attention module utilizes gaze information to guide the model to focus on relevant regions, while the image adaptive cropping strategy uses gaze information to guide the model to eliminate irrelevant regions. Techniques for improving image resolution allow for the magnification of important regions, thereby aiding the extraction of fine-grained features. We evaluate the model's performance on benchmark datasets and conduct ablation study experiments to assess the contributions of each module. Experimental results show that the method achieves a top-1 accuracy of 75.56% on the CVACT dataset, representing state-of-the-art performance. This study provides valuable insights into incorporating human experience into computational models, particularly through gaze-guided learning of visual task networks to enhance model performance.</div></div>","PeriodicalId":54638,"journal":{"name":"Pattern Recognition Letters","volume":"191 ","pages":"Pages 80-88"},"PeriodicalIF":3.9,"publicationDate":"2025-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143637059","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FlowCraft: Unveiling adversarial robustness of LiDAR scene flow estimation","authors":"K.T. Yasas Mahima , Asanka G. Perera , Sreenatha Anavatti , Matt Garratt","doi":"10.1016/j.patrec.2025.02.029","DOIUrl":"10.1016/j.patrec.2025.02.029","url":null,"abstract":"<div><div>With the arrival of deep learning and advanced sensor technologies, the autonomous vehicle domain has gained increased research interest. In particular, deep learning networks developed based on 3D LiDAR sensing data for perception and planning in autonomous vehicles demonstrate remarkable performance. However, recent research reveals vulnerabilities in LiDAR-based perception tasks, such as 3D object detection and segmentation, to intentionally crafted adversarial perturbations. Yet, the adversarial robustness of LiDAR-based regression tasks, such as scene flow estimation, remains largely unexplored. Therefore, this study introduces a novel point perturbation attack named FlowCraft, based on two loss functions, along with a critical analysis of selecting the adversarial objective against scene flow estimation. In particular, evaluations are conducted on trainable, runtime optimization, supervised, and self-supervised scene flow estimation methods using the Argoverse 2 and Waymo datasets in both black-box and white-box settings. Experimental results on the Argoverse 2 benchmark dataset and the DeFlow network show that FlowCraft achieves a relative endpoint error increment of 2.9, while demonstrating a higher endpoint error increase of 5.5 per unit change in Chamfer Distance compared to PGD and CosPGD attacks. Furthermore, our results demonstrate that the performance of point perturbation attacks against runtime optimization methods involves a trade-off between their success rate and overall imperceptibility.</div></div>","PeriodicalId":54638,"journal":{"name":"Pattern Recognition Letters","volume":"191 ","pages":"Pages 37-43"},"PeriodicalIF":3.9,"publicationDate":"2025-03-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143593104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Summarized knowledge guidance for single-frame temporal action localization","authors":"Jinrong Sheng , Ao Li , Yongxin Ge","doi":"10.1016/j.patrec.2025.02.027","DOIUrl":"10.1016/j.patrec.2025.02.027","url":null,"abstract":"<div><div>Single-frame temporal action localization has garnered attention in the computer vision community. Existing methods address annotation sparsity by generating dense pseudo labels within individual videos, but disregard the variable representation from intra-class action instances, resulting in inferior localization completeness. In this paper, we propose to model intra-class relationships by using Summarized Knowledge Guidance (SKG). Specifically, we initially design a learnable memory bank to summarize annotated single-frame knowledge for each class. Then, we introduce two corresponding components, i.e., the knowledge propagation module (KPM) and the knowledge refinement module (KRM), for intra-class guidance. In KPM, we propagate summarized knowledge for feature-level enhancement through bipartite matching. In KRM, summarized knowledge is presented as confident pseudo positive samples for label-level refinement in a contrastive learning manner. Extensive experiments and ablation studies on the THUMOS14, GTEA, and BEOID datasets reveal that our method significantly outperforms state-of-the-art methods.</div></div>","PeriodicalId":54638,"journal":{"name":"Pattern Recognition Letters","volume":"191 ","pages":"Pages 31-36"},"PeriodicalIF":3.9,"publicationDate":"2025-03-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143562378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Beyond the known: Enhancing Open Set Domain Adaptation with unknown exploration","authors":"Lucas Fernando Alvarenga e Silva , Samuel Felipe dos Santos , Nicu Sebe , Jurandy Almeida","doi":"10.1016/j.patrec.2024.12.010","DOIUrl":"10.1016/j.patrec.2024.12.010","url":null,"abstract":"<div><div>Convolutional neural networks (CNNs) can learn directly from raw data, resulting in exceptional performance across various research areas. However, factors present in non-controllable environments such as unlabeled datasets with varying levels of domain and category shift can reduce model accuracy. The Open Set Domain Adaptation (OSDA) is a challenging problem that arises when both of these issues occur together. Existing OSDA approaches in the literature only align known classes or use supervised training to learn unknown classes as a single new category. In this work, we introduce a new approach to improve OSDA techniques by extracting a set of high-confidence unknown instances and using it as a hard constraint to tighten the classification boundaries. Specifically, we use a new loss constraint that is evaluated in three different ways: (1) using <em>pristine</em> negative instances directly; (2) using data augmentation techniques to create randomly <em>transformed</em> negatives; and (3) with <em>generated</em> synthetic negatives containing adversarial features. We analyze different strategies to improve the discriminator and the training of the Generative Adversarial Network (GAN) used to generate synthetic negatives. We conducted extensive experiments and analysis on OVANet using three widely-used public benchmarks, the Office-31, Office-Home, and VisDA datasets. We were able to achieve an H-score similar to that of other state-of-the-art methods, while increasing the accuracy on unknown categories.</div></div>","PeriodicalId":54638,"journal":{"name":"Pattern Recognition Letters","volume":"189 ","pages":"Pages 265-272"},"PeriodicalIF":3.9,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143520023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Convolutional Spiking Neural Networks targeting learning and inference in highly imbalanced datasets","authors":"Bernardete Ribeiro, Francisco Antunes, Dylan Perdigão, Catarina Silva","doi":"10.1016/j.patrec.2024.08.002","DOIUrl":"10.1016/j.patrec.2024.08.002","url":null,"abstract":"<div><div>Spiking Neural Networks (SNNs) are regarded as the next frontier in AI, as they can be implemented on neuromorphic hardware, paving the way for advancements in real-world applications in the field. SNNs provide a biologically inspired solution that is event-driven, energy-efficient and sparse. While showing promising results, there are challenges that need to be addressed. For example, the design-build-evaluate process for integrating the architecture, learning, hyperparameter optimization and inference needs to be tailored to a specific problem. This is particularly important in critical high-stakes industries such as financial services. In this paper, we present SpikeConv, a novel deep Convolutional Spiking Neural Network (CSNN), and investigate this process in the context of a highly imbalanced online bank account opening fraud problem. Our approach is compared with Deep Spiking Neural Networks (DSNNs) and Gradient Boosting Decision Trees (GBDT) showing competitive results.</div></div>","PeriodicalId":54638,"journal":{"name":"Pattern Recognition Letters","volume":"189 ","pages":"Pages 241-247"},"PeriodicalIF":3.9,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141946011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring percolation features with polynomial algorithms for classifying Covid-19 in chest X-ray images","authors":"Guilherme F. Roberto , Danilo C. Pereira , Alessandro S. Martins , Thaína A.A. Tosta , Carlos Soares , Alessandra Lumini , Guilherme B. Rozendo , Leandro A. Neves , Marcelo Z. Nascimento","doi":"10.1016/j.patrec.2024.07.022","DOIUrl":"10.1016/j.patrec.2024.07.022","url":null,"abstract":"<div><div>Covid-19 is a severe illness caused by the Sars-CoV-2 virus, initially identified in China in late 2019 and swiftly spreading globally. Since the virus primarily impacts the lungs, analyzing chest X-rays stands as a reliable and widely accessible means of diagnosing the infection. In computer vision, deep learning models such as CNNs have been the main adopted approach for detection of Covid-19 in chest X-ray images. However, we believe that handcrafted features can also provide relevant results, as shown previously in similar image classification challenges. In this study, we propose a method for identifying Covid-19 in chest X-ray images by extracting and classifying local and global percolation-based features. This technique was tested on three datasets: one comprising 2,002 segmented samples categorized into two groups (Covid-19 and Healthy); another with 1,125 non-segmented samples categorized into three groups (Covid-19, Healthy, and Pneumonia); and a third one composed of 4,809 non-segmented images representing three classes (Covid-19, Healthy, and Pneumonia). Then, 48 percolation features were extracted and given as input to six distinct classifiers. Subsequently, the AUC and accuracy metrics were assessed. We used the 10-fold cross-validation approach and evaluated lesion sub-types via binary and multiclass classification using the Hermite polynomial classifier, a novel approach in this domain. The Hermite polynomial classifier exhibited the most promising outcomes compared to five other machine learning algorithms, wherein the best obtained values for accuracy and AUC were 98.72% and 0.9917, respectively. We also evaluated the influence of noise in the features and in the classification accuracy. These results, based on the integration of percolation features with the Hermite polynomial, hold the potential for enhancing lesion detection and supporting clinicians in their diagnostic endeavors.</div></div>","PeriodicalId":54638,"journal":{"name":"Pattern Recognition Letters","volume":"189 ","pages":"Pages 248-255"},"PeriodicalIF":3.9,"publicationDate":"2025-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142185786","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}