{"title":"Robust facial expression recognition by simultaneously addressing hard and mislabeled samples","authors":"Yuandong Min , Ruyi Xu , Jingying Chen , Yanfeng Ji , Xiaodi Liu","doi":"10.1016/j.patcog.2025.112026","DOIUrl":"10.1016/j.patcog.2025.112026","url":null,"abstract":"<div><div>Facial Expression Recognition (FER) in the wild is a challenging task, especially when training data contains numerous mislabeled samples and hard samples. Typically, FER models either overfit to the mislabeled samples or underfit to the hard samples, resulting in degraded performance. However, most existing methods fail to address both issues simultaneously. To overcome this limitation, this paper introduces a novel FER method called Noise-Hard robust Graph (NHG), which dynamically supervises the updating of the adjacency matrix in the Graph Convolutional Networks, striking a balance between suppressing the impacts of noisy labels and encouraging learning from hard samples. First, we map high-dimensional facial expression features onto low-dimensional manifolds to initialize the topological relationships between the samples, thus measuring the hard sample relationships more accurately. Second, we design a Label Consistency Mask (LCM) strategy to retain potential connections for hard sample learning. LCM could also potentially preserve correct connections while noisy labels exist, supporting noise-robust learning. Third, based on differences in trends of <span><math><msub><mrow><mi>ℓ</mi></mrow><mrow><mn>2</mn></mrow></msub></math></span>-norm variation between mislabeled samples and hard samples, the <span><math><msub><mrow><mi>ℓ</mi></mrow><mrow><mn>2</mn></mrow></msub></math></span>-norm regularization (L2R) suppresses the learning of mislabeled samples while preserving the learning potential of hard samples and suppressing the propagation of their features within the graph. 
Experimental results demonstrate that our method achieves competitive performance compared to state-of-the-art methods in scenarios with noisy labels and hard samples.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"170 ","pages":"Article 112026"},"PeriodicalIF":7.5,"publicationDate":"2025-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144579264","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing robustness and efficiency of least square twin SVM via granular computing","authors":"M. Tanveer, R.K. Sharma , A. Quadir, M. Sajid","doi":"10.1016/j.patcog.2025.112021","DOIUrl":"10.1016/j.patcog.2025.112021","url":null,"abstract":"<div><div>In the domain of machine learning, least square twin support vector machine (LSTSVM) stands out as one of the state-of-the-art classification models. However, LSTSVM is not without its limitations. It exhibits sensitivity to noise and outliers, fails to adequately incorporate the structural risk minimization (SRM) principle, and often demonstrates instability under resampling scenarios. Moreover, its computational complexity and reliance on matrix inversions hinder the efficient processing of large datasets. As a remedy to the aforementioned challenges, we propose the robust granular ball LSTSVM (GBLSTSVM). GBLSTSVM is trained using granular balls instead of original data points. The core of a granular ball is found at its center, where it encapsulates all the pertinent information of the data points within the ball of specified radius. To improve scalability and efficiency, we further introduce the large-scale GBLSTSVM (LS-GBLSTSVM), which incorporates the SRM principle through regularization terms. Experiments on UCI, KEEL, and NDC benchmark datasets demonstrate that both the proposed GBLSTSVM and LS-GBLSTSVM models consistently outperform the baseline models. 
The source code of the proposed GBLSTSVM model is available at <span><span>https://github.com/mtanveer1/GBLSTSVM</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"170 ","pages":"Article 112021"},"PeriodicalIF":7.5,"publicationDate":"2025-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144589032","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Structure anchor graph learning for multi-view clustering","authors":"Wei Guo , Zhe Wang , Wei Shao","doi":"10.1016/j.patcog.2025.111880","DOIUrl":"10.1016/j.patcog.2025.111880","url":null,"abstract":"<div><div>With the growth of data and diverse data sources, clustering large-scale multi-view data has emerged as a prominent topic in the field of machine learning. The anchor graph is an efficient strategy to improve the scalability of graph-based multi-view clustering methods because it can capture the essence of the entire dataset by utilizing only a small set of representative anchor points. However, most existing anchor-graph-based methods encounter at least one of the following two challenges: the first one is the separation of anchor selection from the anchor graph construction process, while the second one is the requirement of an additional clustering step to generate the indicator matrix. Both of the separated steps can potentially lead to suboptimal solutions. In this paper, we propose structure anchor graph learning for multi-view clustering (SAGL), which jointly addresses the two challenges within a unified learning framework. Specifically, instead of utilizing the fixed anchors selected during the pre-processing step, SAGL jointly learns the consensus anchors in the latent space, and constructs the anchor graph by assigning larger similarity values to sample-anchor pairs with shorter distances. Meanwhile, by manipulating the connected components of the anchor graph with a rank constraint, SAGL obtains an anchor graph with a clear cluster structure that can directly reveal the indicator of samples without any post-processing step. As a result, it becomes a truly one-stage end-to-end learning problem. In addition, a simple yet effective transformation is introduced to convert the vector-sum form into a matrix-multiplication form with a trace operation, which leads to an efficient optimization algorithm. 
Extensive experiments on several real-world multi-view datasets demonstrate the effectiveness and efficiency of the proposed methods over other state-of-the-art MvC methods.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"170 ","pages":"Article 111880"},"PeriodicalIF":7.5,"publicationDate":"2025-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144548749","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A survey on image compressive sensing: From classical theory to the latest explicable deep learning","authors":"Lijun Zhao , Yufeng Zhang , Xinlu Wang , Jinjing Zhang , Huihui Bai , Anhong Wang","doi":"10.1016/j.patcog.2025.112022","DOIUrl":"10.1016/j.patcog.2025.112022","url":null,"abstract":"<div><div>Deep learning has achieved significant advancements in both low-level and high-level computer vision tasks, which can also drive the development of an essential research field of Image Compressive Sensing (ICS) today and in the future. Nowadays, model-inspired ICS reconstruction methods have gained considerable attention from researchers, resulting in numerous new developments. However, existing literature lacks a comprehensive summary of these advancements. To revitalize the field of ICS, it is crucial to summarize them to provide valuable insights for various other fields and practical applications. Thus, this article first looks back on foundational theories of ICS, including signal sparse representation, sampling and reconstruction. Next, we summarize different types of measurement matrices for sampling, including learnable/non-learnable and uniform/non-uniform measurement matrices. Then, we provide a detailed review of ICS reconstruction, covering traditional optimization reconstruction methods, inexplicable reconstruction methods and explainable reconstruction methods as well as Transformer-based reconstruction methods, which will help readers quickly grasp the history of ICS development. We also evaluate several representative ICS reconstruction methods on publicly available datasets, comparing their performance and computational complexities to highlight their strengths and weaknesses. Finally, we conclude the paper and discuss future opportunities and challenges for ICS. 
All related materials can be found at <span><span>https://github.com/mdcnn/CS-Survey</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"170 ","pages":"Article 112022"},"PeriodicalIF":7.5,"publicationDate":"2025-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144548746","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AFPN: Alignment feature pyramid network for real-time semantic segmentation","authors":"Yongsheng Dong , Chongchong Mao , Lintao Zheng , Qingtao Wu , Mingchuan Zhang , Xuelong Li","doi":"10.1016/j.patcog.2025.112019","DOIUrl":"10.1016/j.patcog.2025.112019","url":null,"abstract":"<div><div>The structures of two pathways and the feature pyramid network (FPN) have achieved advanced performance in semantic segmentation. These two types of structures adopt different approaches to fuse low-level (shallow layer) spatial information and high-level (deep layer) semantic information. However, the segmentation results still lack local details due to the loss of information caused by simply fusing low-level feature details directly with multi-level deep features. To alleviate this problem, we propose an alignment feature pyramid network (AFPN) for real-time semantic segmentation. It can efficiently utilize both the low-level spatial information and high-level semantic information. Specifically, our AFPN consists of two components: the pooling enhancement attention block (PEAB) and the dual pooling alignment block (DPAB). The PEAB can effectively extract global information by using an aggregation pooling operation. The DPAB performs two types of pooling operations along the channel and spatial dimensions, reducing the differences between multi-scale feature maps. Extensive experiments show that AFPN achieves a better balance between accuracy and speed. On the Cityscapes, CamVid, and ADE20K datasets, AFPN achieves 78.75%, 79.24%, and 39.56% mIoU and the speed meets the real-time requirement. 
Our code is available at <span><span>https://github.com/chongchongmao/AFPN</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"170 ","pages":"Article 112019"},"PeriodicalIF":7.5,"publicationDate":"2025-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144519061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-task dynamic graph learning for brain disorder identification with functional MRI","authors":"Yunling Ma , Chaojun Zhang , Di Xiong , Han Zhang , Shihui Ying","doi":"10.1016/j.patcog.2025.111922","DOIUrl":"10.1016/j.patcog.2025.111922","url":null,"abstract":"<div><div>Dynamic functional connectivity (FC) analysis based on resting-state functional magnetic resonance imaging (rs-fMRI) is widely used for automated diagnosis of brain disorders. A large number of dynamic FC analysis methods rely on sliding window techniques to extract time-varying features of brain activity from localized time periods. However, these methods are sensitive to window parameters and individual differences, leading to significant variability in the extracted features and impacting the stability and accuracy of disease classification. Additionally, while dynamic graph learning holds promise in modeling time-varying brain networks, existing methods still encounter difficulties in effectively capturing spatio-temporal dynamic information. Therefore, in this paper we propose a multi-task dynamic graph learning framework (MT-DGL) to align FC trajectories and learn the spatio-temporal dynamic information for brain disease recognition. The MT-DGL mainly includes three parts: (1) SPD-valued FC trajectory alignment module for overcoming the model’s dependence on sliding window parameters and mitigating the impact of asynchrony in execution rates across individuals, (2) Mamba-based multi-scale dynamic graph learning module for extracting spatio-temporal dynamic features from fMRI time series, and (3) multi-scale fusion and multi-task learning strategy to enhance the model’s understanding of age-related brain FC changes and improve the effectiveness of brain disorder identification. Experimental results indicate that the proposed method exhibits excellent performance in several publicly available fMRI datasets. 
Specifically, on the largest site in the ABIDE dataset, the accuracy and area under the curve reached 73.9% and 74.9%, respectively.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"170 ","pages":"Article 111922"},"PeriodicalIF":7.5,"publicationDate":"2025-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144523906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improve ranking algorithms based on matrix factorization in rating systems","authors":"Shuyan Chen , Shengli Zhang , Gengzhong Zheng","doi":"10.1016/j.patcog.2025.112011","DOIUrl":"10.1016/j.patcog.2025.112011","url":null,"abstract":"<div><div>The proliferation of the Internet has led to an increase in the usage of rating systems. Inspired by matrix factorization, we present two improved iterative ranking algorithms called L1-AVG-RMF and AA-RMF for rating systems. In the new algorithms, the missing ratings are estimated by matrix factorization before applying traditional ranking algorithms. Theoretical analysis shows that the proposed algorithms have better accuracy and robustness, which is also confirmed by experiments on synthetic and real data. Additionally, experimental results show that L1-AVG-RMF has superior effectiveness and robustness compared to some other ranking algorithms. Our findings emphasize the potential benefits of applying matrix factorization to ranking algorithms.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"170 ","pages":"Article 112011"},"PeriodicalIF":7.5,"publicationDate":"2025-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144548747","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"WANN-DPC: Density peaks finding clustering based on Weighted Adaptive Nearest Neighbors","authors":"Juanying Xie , Huan Yan , Mingzhao Wang , Philip W. Grant , Witold Pedrycz","doi":"10.1016/j.patcog.2025.111953","DOIUrl":"10.1016/j.patcog.2025.111953","url":null,"abstract":"<div><div>DPC (Density Peak Clustering) algorithm and most of its variants are unable to identify the cluster centers of dense and sparse clusters simultaneously. In addition, the “Domino Effect” of DPC cannot be entirely avoided in its variants. Despite ANN-DPC (Adaptive Nearest Neighbor DPC) being able to detect cluster centers of dense and sparse clusters, its adaptive nearest neighbors of a point may introduce bias in the local density, cluster centers and clustering. To address these limitations of ANN-DPC, the WANN-DPC (Weighted Adaptive Nearest Neighbor DPC) algorithm is proposed. The key contributions of WANN-DPC are as follows: (1) A novel weighted local density of a point is defined by weighting its close and far neighbors, (2) a correction factor is proposed to detect cluster centers in turn, and (3) a two-step assignment strategy is presented utilizing nearest neighbor relationships and weighted membership degrees. Extensive experiments on benchmark datasets demonstrate the superiority of the WANN-DPC over its peers.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"170 ","pages":"Article 111953"},"PeriodicalIF":7.5,"publicationDate":"2025-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144535801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning region-aware style-content feature transformations for face image beautification","authors":"Zhen Xu, Si Wu","doi":"10.1016/j.patcog.2025.111861","DOIUrl":"10.1016/j.patcog.2025.111861","url":null,"abstract":"<div><div>As a representative image-to-image translation task, facial makeup transfer is typically performed by applying intermediate feature normalization, conditioned on the style information extracted from a reference image. However, the relevant methods are typically limited in their range of applicability, because the style information is independent of the source images and lacks spatial details. To realize precise makeup transfer and to further associate it with face component editing, we propose a Semantic Region Style-content Feature Transformation approach, which is referred to as SRSFT. Specifically, we encode both reference and source images into region-wise feature vectors and maps, based on semantic segmentation masks. To address the misalignment in poses and expressions, region-wise spatial transformations are inferred to align the reference and source masks, and are then applied to explicitly warp the reference feature maps to the source face, without any extra supervision. The resulting feature maps are fused with the source ones and inserted into a generator for image synthesis. On the other hand, the reference and source feature vectors are also fused and used to determine the modulation parameters at multiple intermediate layers. 
SRSFT is able to achieve superior beautification performance in terms of seamlessness and fidelity.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"170 ","pages":"Article 111861"},"PeriodicalIF":7.5,"publicationDate":"2025-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144518994","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Boosting zero-shot learning through neuro-symbolic integration","authors":"Francesco Manigrasso, Fabrizio Lamberti, Lia Morra","doi":"10.1016/j.patcog.2025.111869","DOIUrl":"10.1016/j.patcog.2025.111869","url":null,"abstract":"<div><div>Zero-shot learning (ZSL) aims to train deep neural networks to recognize objects from unseen classes, starting from a semantic description of the concepts. Neuro-symbolic (NeSy) integration refers to a class of techniques that incorporate symbolic knowledge representation and reasoning with the learning capabilities of deep neural networks. However, to date, few studies have explored how to leverage NeSy techniques to inject prior knowledge during the training process to boost ZSL capabilities. Here, we present Fuzzy Logic Prototypical Network (FLPN) that formulates the classification task as prototype matching in a visual-semantic embedding space, which is trained by optimizing a NeSy loss. Specifically, FLPN exploits the Logic Tensor Network (LTN) framework to incorporate background knowledge in the form of logical axioms by grounding a first-order logic language as differentiable operations between real tensors. This prior knowledge includes class hierarchies (classes and macroclasses) along with robust high-level inductive biases. The latter allow, for instance, to handle exceptions in class-level attributes and to enforce similarity between images of the same class, preventing premature overfitting to seen classes and improving overall performance. Both class-level and attribute-level prototypes through an attention mechanism specialized for either convolutional- or transformer-based backbones. FLPN achieves state-of-the-art performance on the GZSL benchmarks AWA2 and SUN, matching or exceeding the performance of competing algorithms with minimal computational overhead. 
The code is available at <span><span>https://github.com/FrancescoManigrass/FLPN</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"170 ","pages":"Article 111869"},"PeriodicalIF":7.5,"publicationDate":"2025-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144580517","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}