{"title":"All-in-one weather removal via Multi-Depth Gated Transformer with gradient modulation","authors":"Xiang Li, Jianwu Li","doi":"10.1016/j.patcog.2025.111643","DOIUrl":"10.1016/j.patcog.2025.111643","url":null,"abstract":"<div><div>All-in-one weather removal methods have made impressive progress recently, but their ability to recover finer details from degraded images still needs to be improved, since (1) the difficulty of Convolutional Neural Networks (CNNs) in providing long-distance information interaction or Visual Transformer with simple convolutions in extracting richer local details, makes them unable to effectively utilize similar original texture features in different regions of a degraded image, and (2) under complex weather degradation distributions, their pixel reconstruction loss functions often result in losing high-frequency details in restored images, even when perceptual loss is used. In this paper, we propose a Multi-Depth Gated Transformer Network (MDGTNet) for all-in-one weather removal, with (1) a multi-depth gated module to capture richer background texture details from various weather noises in an input-adaptive manner, (2) self-attentions to reconstruct similar background textures via long-range feature interaction, and (3) a novel Adaptive Smooth <span><math><msub><mrow><mtext>L</mtext></mrow><mrow><mn>1</mn></mrow></msub></math></span>\u0000 (<span><math><msub><mrow><mtext>ASL</mtext></mrow><mrow><mn>1</mn></mrow></msub></math></span>) loss based on gradient modulation to prompt finer detail restoration. Experimental results show that our method achieves superior performance on both synthetic and real-world benchmarks. Source code is available at <span><span>https://github.com/xiangLi-bit/MDGTNet</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"165 ","pages":"Article 111643"},"PeriodicalIF":7.5,"publicationDate":"2025-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143792224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Trace Back and Go Ahead: Completing partial annotation for continual semantic segmentation","authors":"Yuxuan Luo , Jinpeng Chen , Runmin Cong , Horace Ho Shing Ip , Sam Kwong","doi":"10.1016/j.patcog.2025.111613","DOIUrl":"10.1016/j.patcog.2025.111613","url":null,"abstract":"<div><div>Existing Continual Semantic Segmentation (CSS) methods effectively address the issue of <em>background shift</em> in regular training samples. However, this issue persists in exemplars, <em>i.e.</em>, replay samples, which is often overlooked. Each exemplar is annotated only with the classes from its originating task, while other past classes and the current classes during replay are labeled as <em>background</em>. This partial annotation can erase the network’s knowledge of previous classes and impede the learning of new classes. To resolve this, we introduce a new method named Trace Back and Go Ahead (TAGA), which utilizes a backward annotator model and a forward annotator model to generate pseudo-labels for both regular training samples and exemplars, aiming at reducing the adverse effects of incomplete annotations. This approach effectively mitigates the risk of incorrect guidance from both sample types, offering a comprehensive solution to <em>background shift</em>. Additionally, due to a significantly smaller number of exemplars compared to regular training samples, the class distribution in the sample pool of each incremental task exhibits a long-tailed pattern, potentially biasing classification towards incremental classes. Consequently, TAGA incorporates a class-equilibrium sampling strategy that adaptively adjusts the sampling frequencies based on the ratios of exemplars to regular samples and past to new classes, counteracting the skewed distribution. Extensive experiments on two public datasets, Pascal VOC 2012 and ADE20K, demonstrate that our method surpasses state-of-the-art methods.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"165 ","pages":"Article 111613"},"PeriodicalIF":7.5,"publicationDate":"2025-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143768429","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SymGraphAU: Prior knowledge based symbolic graph for action unit recognition","authors":"Weicheng Xie , Fan Yang , Junliang Zhang , Siyang Song , Linlin Shen , Zitong Yu , Cheng Luo","doi":"10.1016/j.patcog.2025.111640","DOIUrl":"10.1016/j.patcog.2025.111640","url":null,"abstract":"<div><div>The prior and sample-aware semantic association between facial Action Units (AUs) and expressions, which could yield insightful cues for the recognition of AUs, remains underexplored within the existing body of literature. In this paper, we introduce a novel AU recognition method to explicitly explore both AUs and Expressions, incorporating existing knowledge about their relationships. Specifically, we novelly use the Conjunctive Normal Form (CNF) in propositional logic to express these knowledges. Thanks to the flexible and explainable logic proposition, our method can dynamically build a knowledge base specifically for each sample, which is not limited to fixed prior knowledge pattern. Furthermore, a new regularization mechanism is introduced to learn the predefined rules of logical knowledge based on embedding graph convolutional networks. Extensive experiments show that our approach can outperform current state-of-the-art AU recognition methods on the BP4D and DISFA datasets. Our codes will be made publicly available.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"165 ","pages":"Article 111640"},"PeriodicalIF":7.5,"publicationDate":"2025-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143792229","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Advancing evolution characterization in dynamic networks: A quantum walk and thermodynamics perspective","authors":"Deng Pan , Yingyue Zhang , Shenglong Liu , Zhihong Zhang","doi":"10.1016/j.patcog.2025.111630","DOIUrl":"10.1016/j.patcog.2025.111630","url":null,"abstract":"<div><div>Dynamic networks are useful models for representing changing systems, such as social and trading networks. Accurately characterizing the evolution states of these networks is crucial for effective representation learning and future link prediction. However, existing methods struggle to distinguish between similar local evolution states, providing the same embedding for nodes in similar but distinct local graph topologies. In addition, previous methods fail to capture the change of overall topology over time, i.e. global evolution state, which limits these methods to a local or static structure. To address these limitations, we propose a novel framework called <strong>EANQWT</strong> (<strong>E</strong>volution <strong>A</strong>ware <strong>N</strong>etwork with <strong>Q</strong>uantum <strong>W</strong>alk and <strong>T</strong>hermodynamics). During the encoding phase, the framework utilizes the average results of continuous-time quantum walks as quantum migration probabilities to differentiate between similar local evolution states. Additionally, EANQWT integrates an innovative Multi-view Thermodynamic Mixture of Experts (MTMoE) decoder, which considers quantum thermodynamic temperatures, a measurement of changes in the entire graph, from multiple perspectives to determine the existence of links between nodes. Our experimental results in both transductive and inductive settings show that EANQWT either surpasses or matches various state-of-the-art baselines.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"165 ","pages":"Article 111630"},"PeriodicalIF":7.5,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143792225","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Partial consistent adversarial unified framework for unsupervised non-contrast CT cross-domain adaptation and segmentation","authors":"Qikui Zhu , Yanqing Wang , Shaoming Zhu , Bo Du","doi":"10.1016/j.patcog.2025.111638","DOIUrl":"10.1016/j.patcog.2025.111638","url":null,"abstract":"<div><div>If kidney tumors can be segmented using non-contrast computed tomography (NC-CT) imaging alone, it would represent a major clinical advancement. Yet, to our best knowledge, existing kidney tumor segmentation methods rely heavily on high-resolution, high-sensitivity Contrast-Enhanced Computed Tomography (CE-CT) imaging due to the challenges of complex anatomy and morphology and weak boundaries of tumors. Moreover, these methods typically treat domain adaptation and segmentation as <em>Separate Steps</em>, with a primary focus on global domain adaptation, lacking the ability to prioritize tumor. To address above challenges, we propose a novel Unified Cross-domain Adaptation and Segmentation (UCAS) framework for unsupervised NC-CT kidney tumor adaptation and segmentation. Specifically, (1) UCAS integrates domain adaptation and segmentation into a novel unified framework, making the domain adaptation tumor-sensitive and addressing the limitations in focusing on tumor semantics during domain adaptation. (2) To bridge domain adaptation and segmentation, a novel Partial Consistent learning is proposed that leverages the partial semantics of tumors from synthetic CE-CT imaging to directly guide the domain adaptation. Furthermore, to address low-contrast and low-sensitivity challenges of tumor in NC-CT imaging, (3) a novel Partial Adversarial Learning is proposed to maintain the consistency between CE-CT and synthetic CE-CT imaging within local tumor semantic space. Moreover, to eliminate the semantic gap between the adaptation and segmentation tasks, (4) an effective Semantic Contrastive Learning aligns the domain adaptation and segmentation. The experimental results evaluated from segmentation and domain adaptation affirm that UCAS can segment kidney tumors from NC-CT imaging, outperforming state-of-the-art cross-domain segmentation methods and demonstrating significant clinical value.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"165 ","pages":"Article 111638"},"PeriodicalIF":7.5,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143799051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-information Fusion Graph Convolutional Network for cancer driver gene identification","authors":"Die Hu , Yanbei Liu , Xiao Wang , Lei Geng , Fang Zhang , Zhitao Xiao , Jerry Chun-Wei Lin","doi":"10.1016/j.patcog.2025.111619","DOIUrl":"10.1016/j.patcog.2025.111619","url":null,"abstract":"<div><div>Cancer is generally thought to be caused by the accumulation of mutations in driver genes. The identification of cancer driver genes is crucial for cancer research, diagnosis and treatment. Despite existing methods, challenges remain in comprehensively learning of the attributes and intricate interactions of genetic data. We propose a novel Multi-information Fusion Graph Convolutional Network (MF-GCN) for cancer driver gene identification, based on multi-omics pan-cancer data and Gene Regulatory Network (GRN) data. Directed topological and attribute graph networks learn gene interactions and self-attribute information, while a common graph network captures consistency between topology and attributes. An attention mechanism adaptively fuses these information with importance weights to identify cancer driver genes. Experimental results showed that MF-GCN can effectively identify cancer driver genes across three GRN datasets, with AUROC and AUPRC improvements of 2.66% and 2.69%, respectively, compared with the state-of-the-art approaches.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"165 ","pages":"Article 111619"},"PeriodicalIF":7.5,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143792210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hyperspectral image restoration via the collaboration of low-rank tensor denoising and completion","authors":"Tianheng Zhang , Jianli Zhao , Sheng Fang , Zhe Li , Qing Zhang , Maoguo Gong","doi":"10.1016/j.patcog.2025.111629","DOIUrl":"10.1016/j.patcog.2025.111629","url":null,"abstract":"<div><div>Hyperspectral images (HSIs) are always damaged by various types of noise during acquisition and transmission. Low-rank tensor denoising methods have achieved state-of-the-art results in current HSIs restoration tasks. However, all these methods remove the mixed noise in HSI based on the representation of image prior information. In this paper, we consider a problem for the first time: Structured noise like stripes and deadlines confounds image priors, hindering effective image-noise separation in current approaches. Motivated by this, a new HSI restoration model based on the collaboration of low-rank tensor denoising and completion (LR-TDTC) is proposed. Firstly, the structured noise detection algorithm is applied to identify the positions of structured noise such as stripes and deadlines, achieving the separation of unstructured noise and structured noise. The entries in the structured noisy area are removed. Then, for unstructured noise, a tensor denoising module (TD) based on image prior representation is introduced to separate images and noise. For structured noise, a tensor completion module (TC) based on full-mode-augmentation tensor train rank minimization is introduced to complete the noise area. Finally, the two modules collaborate through the mutual utilization of information to achieve the restoration of the entire image. To solve the LR-TDTC model, a variable tessellation iterative algorithm (VTI) is proposed. VTI utilizes a serialization strategy to enable TD and TC modules to effectively utilize each other's latest iteration results, achieving efficient collaboration between the two. The mixed noise removal experiments on multiple HSIs show that the proposed method has outstanding advantages.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"165 ","pages":"Article 111629"},"PeriodicalIF":7.5,"publicationDate":"2025-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143768428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deepfake detection with domain generalization and mask-guided supervision","authors":"Jicheng Li , Yongjian Hu , Beibei Liu , Huimin She , Chang-Tsun Li","doi":"10.1016/j.patcog.2025.111622","DOIUrl":"10.1016/j.patcog.2025.111622","url":null,"abstract":"<div><div>Most existing deepfake (video face forgery) detectors work well in intra-dataset testing, but their performance degrades severely in cross-dataset testing. Cross-dataset generalization remains a major challenge. Since domain generalization (DG) aims to learn domain-invariant features while suppressing domain specific features, we propose a DG framework for improving face forgery detection in this study. Our detector consists of two modules. The first module learns both spatial and spectral features from frame images. The second one learns high-level feature patterns from the outputs of the first module, and constructs the classification features with the help of face mask-guided supervision. The classification result is fine-tuned by a confidence-based correction mechanism. The DG framework is realized through a bi-level optimization process. Extensive experiments demonstrate that our detector works effectively in both intra- and cross-dataset testing. Compared with 8 typical methods, it has the best overall performance and the highest robustness against common perturbations.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"165 ","pages":"Article 111622"},"PeriodicalIF":7.5,"publicationDate":"2025-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143768432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Single source domain generalization for palm biometrics","authors":"Congcong Jia , Xingbo Dong , Yen Lung Lai , Andrew Beng Jin Teoh , Ziyuan Yang , Xiaoyan Zhang , Liwen Wang , Zhe Jin , Lianqiang Yang","doi":"10.1016/j.patcog.2025.111620","DOIUrl":"10.1016/j.patcog.2025.111620","url":null,"abstract":"<div><div>In palmprint recognition, domain shifts caused by device differences and environmental variations presents a significant challenge. Existing approaches often require multiple source domains for effective domain generalization (DG), limiting their applicability in single-source domain scenarios. To address this challenge, we propose PalmRSS, a novel Palm Recognition approach based on Single Source Domain Generalization (SSDG). PalmRSS reframes the SSDG problem as a DG problem by partitioning the source domain dataset into subsets and employing image alignment and adversarial training. PalmRSS exchanges low-level frequencies of palm data and performs histogram matching between samples to align spectral characteristics and pixel intensity distributions. Experiments demonstrate that PalmRSS outperforms state-of-the-art methods, highlighting its effectiveness in single source domain generalization. The code is released at <span><span>https://github.com/yocii/PalmRSS</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"165 ","pages":"Article 111620"},"PeriodicalIF":7.5,"publicationDate":"2025-03-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143746199","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel heterogeneous data classification approach combining gradient boosting decision trees and hybrid structure model","authors":"Feng Xu , Yuting Huang , Hui Wang , Zizhu Fan","doi":"10.1016/j.patcog.2025.111614","DOIUrl":"10.1016/j.patcog.2025.111614","url":null,"abstract":"<div><div>Graph neural network (GNN) is crucial in graph representation learning tasks. However, when the feature of graph network nodes is complex, such as those originating from heterogeneous data or multi-view data, graph neural network methods encounter difficulties. It is well known that gradient boosting decision trees (GBDT) excel at handling heterogeneous tabular data, while GNN and HGNN perform well with low-order and high-order sparse matrices. Therefore, we propose a method that combines their strengths by using GBDT to handle heterogeneous features, while a hybrid structured model (HSM) based on GNN and hypergraph neural network (HGNN), which can effectively capture both low-order and high-order information, backpropagates gradients to the GBDT. The proposed GBDT-HSM algorithm performs well on four structured tabular datasets and two multi-view datasets. It achieves state-of-the-art performance, showcasing its potential in addressing the challenges of heterogeneous data classification. The code is available at <span><span>https://github.com/zzfan3/GBDT-HSM</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"165 ","pages":"Article 111614"},"PeriodicalIF":7.5,"publicationDate":"2025-03-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143746200","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}