Pattern Recognition: Latest Articles

AniFaceDiff: Animating stylized avatars via parametric conditioned diffusion models
IF 7.5, Zone 1, Computer Science
Pattern Recognition, Volume 170, Article 112017. Pub Date: 2025-06-28. DOI: 10.1016/j.patcog.2025.112017
Ken Chen, Sachith Seneviratne, Wei Wang, Dongting Hu, Sanjay Saha, Md. Tarek Hasan, Sanka Rasnayaka, Tamasha Malepathirana, Mingming Gong, Saman Halgamuge
Abstract: Animating stylized head avatars with dynamic poses and expressions has become an important focus in recent research due to its broad range of applications (e.g. VR/AR, film and animation, privacy protection). Previous research has made significant progress by training controllable generative models to animate the reference avatar using the target pose and expression. However, existing portrait animation methods are mostly trained on human faces, so they struggle to generalize to stylized avatar references such as cartoon, painting, and 3D-rendered avatars. Moreover, the mechanisms used to animate avatars (namely, to control the pose and expression of the reference) often inadvertently introduce unintended features from the target, such as facial shape, while also causing a loss of intended features, like expression-related details. This paper proposes AniFaceDiff, a Stable Diffusion-based method with a new conditioning module for animating stylized avatars. First, we propose a refined spatial conditioning approach based on Facial Alignment to minimize identity mismatches, particularly between stylized avatars and human faces. Then, we introduce an Expression Adapter that incorporates additional cross-attention layers to address the potential loss of expression-related information. Extensive experiments demonstrate that our method achieves state-of-the-art performance, particularly in the most challenging out-of-domain stylized avatar animation, i.e., domains unseen during training. It delivers superior image quality, identity preservation, and expression accuracy. This work enhances the quality of virtual stylized avatar animation for constructive and responsible applications. To promote ethical use in virtual environments, we contribute to the advancement of face manipulation detection by evaluating state-of-the-art detectors, highlighting potential areas for improvement, and suggesting solutions.
Citations: 0
Deep matrix factorization with adaptive weights for multi-view clustering
IF 7.5, Zone 1, Computer Science
Pattern Recognition, Volume 170, Article 112027. Pub Date: 2025-06-28. DOI: 10.1016/j.patcog.2025.112027
Yasser Khalafaoui, Basarab Matei, Martino Lovisetto, Nistor Grozavu
Abstract: Recently, deep matrix factorization has been established as a powerful model for unsupervised tasks, achieving promising results, especially for multi-view clustering. However, existing methods often lack effective feature selection mechanisms and rely on empirical hyperparameter selection. To address these issues, we introduce a novel Deep Matrix Factorization with Adaptive Weights for Multi-View Clustering (DMFAW). Our method simultaneously incorporates feature selection and generates local partitions, enhancing clustering results. The feature weights are driven by a single, control-theory-inspired parameter that is updated dynamically, which improves stability and speeds convergence. A late fusion approach is then proposed to align the weighted local partitions with the consensus partition. Finally, the optimization problem is solved via an alternating optimization algorithm with theoretically guaranteed convergence. Extensive experiments on benchmark datasets show that DMFAW outperforms state-of-the-art methods in terms of clustering performance.
Citations: 0
Structure anchor graph learning for multi-view clustering
IF 7.5, Zone 1, Computer Science
Pattern Recognition, Volume 170, Article 111880. Pub Date: 2025-06-27. DOI: 10.1016/j.patcog.2025.111880
Wei Guo, Zhe Wang, Wei Shao
Abstract: With the growth of data and diverse data sources, clustering large-scale multi-view data has emerged as a prominent topic in the field of machine learning. The anchor graph is an efficient strategy for improving the scalability of graph-based multi-view clustering methods because it can capture the essence of the entire dataset using only a small set of representative anchor points. However, most existing anchor-graph-based methods encounter at least one of the following two challenges: the first is the separation of anchor selection from the anchor graph construction process, and the second is the requirement of an additional clustering step to generate the indicator matrix. Both of these separated steps can lead to suboptimal solutions. In this paper, we propose structure anchor graph learning for multi-view clustering (SAGL), which jointly addresses the two challenges within a unified learning framework. Specifically, instead of using fixed anchors selected during a pre-processing step, SAGL jointly learns the consensus anchors in the latent space and constructs the anchor graph by assigning larger similarity values to sample-anchor pairs with shorter distances. Meanwhile, by manipulating the connected components of the anchor graph with a rank constraint, SAGL obtains an anchor graph with a clear cluster structure that directly reveals the indicator of samples without any post-processing step. As a result, it becomes a truly one-stage end-to-end learning problem. In addition, a simple yet effective transformation is introduced to convert the vector-sum form to a matrix-multiplication form with a trace operation, which leads to an efficient optimization algorithm. Extensive experiments on several real-world multi-view datasets demonstrate the effectiveness and efficiency of the proposed method over other state-of-the-art MvC methods.
Citations: 0
A survey on image compressive sensing: From classical theory to the latest explicable deep learning
IF 7.5, Zone 1, Computer Science
Pattern Recognition, Volume 170, Article 112022. Pub Date: 2025-06-27. DOI: 10.1016/j.patcog.2025.112022
Lijun Zhao, Yufeng Zhang, Xinlu Wang, Jinjing Zhang, Huihui Bai, Anhong Wang
Abstract: Deep learning has achieved significant advancements in both low-level and high-level computer vision tasks, which can also drive the development of the essential research field of Image Compressive Sensing (ICS) today and in the future. Model-inspired ICS reconstruction methods have recently gained considerable attention from researchers, resulting in numerous new developments. However, the existing literature lacks a comprehensive summary of these advancements. To revitalize the field of ICS, it is crucial to summarize them to provide valuable insights for various other fields and practical applications. Thus, this article first looks back on the foundational theories of ICS, including signal sparse representation, sampling and reconstruction. Next, we summarize different types of measurement matrices for sampling, which include learnable/non-learnable and uniform/non-uniform measurement matrices. Then, we provide a detailed review of ICS reconstruction, covering traditional optimization reconstruction methods, inexplicable reconstruction methods and explainable reconstruction methods as well as Transformer-based reconstruction methods, which will help readers quickly grasp the history of ICS development. We also evaluate several representative ICS reconstruction methods on publicly available datasets, comparing their performance and computational complexities to highlight their strengths and weaknesses. Finally, we conclude the paper and discuss future opportunities and challenges. All related materials can be found at https://github.com/mdcnn/CS-Survey.
Citations: 0
AFPN: Alignment feature pyramid network for real-time semantic segmentation
IF 7.5, Zone 1, Computer Science
Pattern Recognition, Volume 170, Article 112019. Pub Date: 2025-06-26. DOI: 10.1016/j.patcog.2025.112019
Yongsheng Dong, Chongchong Mao, Lintao Zheng, Qingtao Wu, Mingchuan Zhang, Xuelong Li
Abstract: Two-pathway structures and the feature pyramid network (FPN) have achieved advanced performance in semantic segmentation. These two types of structures adopt different approaches to fuse low-level (shallow layer) spatial information and high-level (deep layer) semantic information. However, the segmentation results still lack local details due to the loss of information caused by simply fusing low-level feature details directly with multi-level deep features. To alleviate this problem, we propose an alignment feature pyramid network (AFPN) for real-time semantic segmentation. It can efficiently utilize both the low-level spatial information and the high-level semantic information. Specifically, our AFPN consists of two components: the pooling enhancement attention block (PEAB) and the dual pooling alignment block (DPAB). The PEAB can effectively extract global information by using an aggregation pooling operation. The DPAB performs two types of pooling operations along the channel and spatial dimensions, reducing the differences between multi-scale feature maps. Extensive experiments show that AFPN achieves a better balance between accuracy and speed. On the Cityscapes, CamVid, and ADE20K datasets, AFPN achieves 78.75%, 79.24%, and 39.56% mIoU, respectively, and its speed meets the real-time requirement. Our code is available at https://github.com/chongchongmao/AFPN.
Citations: 0
Multi-task dynamic graph learning for brain disorder identification with functional MRI
IF 7.5, Zone 1, Computer Science
Pattern Recognition, Volume 170, Article 111922. Pub Date: 2025-06-26. DOI: 10.1016/j.patcog.2025.111922
Yunling Ma, Chaojun Zhang, Di Xiong, Han Zhang, Shihui Ying
Abstract: Dynamic functional connectivity (FC) analysis based on resting-state functional magnetic resonance imaging (rs-fMRI) is widely used for automated diagnosis of brain disorders. A large number of dynamic FC analysis methods rely on sliding window techniques to extract time-varying features of brain activity from localized time periods. However, these methods are sensitive to window parameters and individual differences, leading to significant variability in the extracted features and impacting the stability and accuracy of disease classification. Additionally, while dynamic graph learning holds promise in modeling time-varying brain networks, existing methods still encounter difficulties in effectively capturing spatio-temporal dynamic information. Therefore, in this paper we propose a multi-task dynamic graph learning framework (MT-DGL) to align FC trajectories and learn the spatio-temporal dynamic information for brain disease recognition. MT-DGL mainly includes three parts: (1) an SPD-valued FC trajectory alignment module for overcoming the model's dependence on sliding window parameters and mitigating the impact of asynchrony in execution rates across individuals, (2) a Mamba-based multi-scale dynamic graph learning module for extracting spatio-temporal dynamic features from fMRI time series, and (3) a multi-scale fusion and multi-task learning strategy to enhance the model's understanding of age-related brain FC changes and improve the effectiveness of brain disorder identification. Experimental results indicate that the proposed method exhibits excellent performance on several publicly available fMRI datasets. Specifically, on the largest site in the ABIDE dataset, the accuracy and area under the curve reached 73.9% and 74.9%, respectively.
Citations: 0
Improve ranking algorithms based on matrix factorization in rating systems
IF 7.5, Zone 1, Computer Science
Pattern Recognition, Volume 170, Article 112011. Pub Date: 2025-06-26. DOI: 10.1016/j.patcog.2025.112011
Shuyan Chen, Shengli Zhang, Gengzhong Zheng
Abstract: The proliferation of the Internet has led to an increase in the usage of rating systems. Inspired by matrix factorization, we present two improved iterative ranking algorithms, called L1-AVG-RMF and AA-RMF, for rating systems. In the new algorithms, the missing ratings are estimated by matrix factorization before traditional ranking algorithms are applied. Theoretical analysis shows that the proposed algorithms have better accuracy and robustness, which is also demonstrated by experiments with synthetic and real data. Additionally, experimental results show that L1-AVG-RMF has superior effectiveness and robustness compared to some other ranking algorithms. Our findings emphasize the potential benefits of applying matrix factorization to ranking algorithms.
Citations: 0
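The two-stage idea in the abstract above (estimate the missing ratings by matrix factorization, then rank) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the toy rating matrix, the plain SGD factorization, and above all the final ranking step, which uses a simple arithmetic mean as a stand-in because the abstract does not describe how L1-AVG-RMF or AA-RMF compute their rankings.

```python
import random


def factorize(ratings, rank=2, steps=2000, lr=0.01, reg=0.02, seed=0):
    """Fit ratings[u][i] ~ dot(P[u], Q[i]) on the observed entries with SGD.

    Hypothetical helper: missing ratings are marked with None.
    """
    rng = random.Random(seed)
    n_users, n_items = len(ratings), len(ratings[0])
    P = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n_items)]
    observed = [(u, i, r) for u, row in enumerate(ratings)
                for i, r in enumerate(row) if r is not None]
    for _ in range(steps):
        for u, i, r in observed:
            err = r - sum(p * q for p, q in zip(P[u], Q[i]))
            for k in range(rank):
                p, q = P[u][k], Q[i][k]
                # Gradient step with L2 regularization on both factors.
                P[u][k] += lr * (err * q - reg * p)
                Q[i][k] += lr * (err * p - reg * q)
    return P, Q


def fill_missing(ratings, P, Q):
    """Keep observed ratings; replace None with the factorization estimate."""
    return [[r if r is not None else sum(p * q for p, q in zip(P[u], Q[i]))
             for i, r in enumerate(row)]
            for u, row in enumerate(ratings)]


def rank_items(filled):
    """Stand-in ranking: order items by mean rating, best first."""
    n_items = len(filled[0])
    means = [sum(row[i] for row in filled) / len(filled) for i in range(n_items)]
    return sorted(range(n_items), key=lambda i: means[i], reverse=True)


# Toy data: 4 users x 4 items, None marks a missing rating.
ratings = [
    [5, 4, None, 1],
    [4, None, 2, 1],
    [None, 5, 3, 2],
    [5, 4, 2, None],
]
P, Q = factorize(ratings)
filled = fill_missing(ratings, P, Q)
ranking = rank_items(filled)
print(ranking)
```

The point of the sketch is only the order of operations: the ranking step sees a complete matrix, so items with few observed ratings are no longer judged on those few ratings alone.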
WANN-DPC: Density peaks finding clustering based on Weighted Adaptive Nearest Neighbors
IF 7.5, Zone 1, Computer Science
Pattern Recognition, Volume 170, Article 111953. Pub Date: 2025-06-25. DOI: 10.1016/j.patcog.2025.111953
Juanying Xie, Huan Yan, Mingzhao Wang, Philip W. Grant, Witold Pedrycz
Abstract: The DPC (Density Peak Clustering) algorithm and most of its variants are unable to identify the cluster centers of dense and sparse clusters simultaneously. In addition, the "Domino Effect" of DPC cannot be entirely avoided in its variants. Although ANN-DPC (Adaptive Nearest Neighbor DPC) is able to detect the cluster centers of dense and sparse clusters, the adaptive nearest neighbors of a point may introduce bias in the local density, the cluster centers and the clustering. To address these limitations of ANN-DPC, the WANN-DPC (Weighted Adaptive Nearest Neighbor DPC) algorithm is proposed. The key contributions of WANN-DPC are as follows: (1) a novel weighted local density of a point is defined by weighting its close and far neighbors, (2) a correction factor is proposed to detect cluster centers in turn, and (3) a two-step assignment strategy is presented that utilizes nearest neighbor relationships and weighted membership degrees. Extensive experiments on benchmark datasets demonstrate the superiority of WANN-DPC over its peers.
Citations: 0
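For context, the baseline DPC procedure that the abstract above builds on can be sketched as follows. This is the plain cutoff-kernel version as it is commonly described, not WANN-DPC: the weighted density, the correction factor and the two-step assignment are not reproduced, because the abstract gives no formulas for them. The function name, the cutoff `d_c` and the toy points are assumptions chosen for illustration.

```python
import math


def density_peaks(points, d_c, n_centers):
    """Baseline density-peaks clustering with a cutoff kernel (sketch)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density rho: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]

    # Visit points from densest to sparsest (ties broken by index).
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    # delta: distance to the nearest point that comes earlier in that order;
    # the densest point gets its largest distance to any point instead.
    delta = [0.0] * n
    parent = [-1] * n
    for pos, i in enumerate(order):
        if pos == 0:
            delta[i] = max(dist[i])
        else:
            j = min(order[:pos], key=lambda k: dist[i][k])
            delta[i] = dist[i][j]
            parent[i] = j

    # Centers: the points with the largest rho * delta.
    centers = sorted(order, key=lambda i: -rho[i] * delta[i])[:n_centers]

    # Every remaining point inherits the label of its denser nearest neighbor.
    labels = [-1] * n
    for c_id, c in enumerate(centers):
        labels[c] = c_id
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return centers, labels


# Toy data: two well-separated groups of five points each.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (8, 8), (8, 9), (9, 8), (9, 9), (8.5, 8.5)]
centers, labels = density_peaks(points, d_c=1.0, n_centers=2)
print(centers, labels)
```

The final loop assigns labels by following a chain of denser neighbors, so one wrong link is inherited by every point further down the chain; this is presumably the kind of error propagation the abstract calls the "Domino Effect".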
Learning region-aware style-content feature transformations for face image beautification
IF 7.5, Zone 1, Computer Science
Pattern Recognition, Volume 170, Article 111861. Pub Date: 2025-06-25. DOI: 10.1016/j.patcog.2025.111861
Zhen Xu, Si Wu
Abstract: As a representative image-to-image translation task, facial makeup transfer is typically performed by applying intermediate feature normalization, conditioned on the style information extracted from a reference image. However, such methods are typically limited in their range of applicability, because the style information is independent of the source images and lacks spatial details. To realize precise makeup transfer and further support face component editing, we propose a Semantic Region Style-content Feature Transformation approach, referred to as SRSFT. Specifically, we encode both reference and source images into region-wise feature vectors and maps, based on semantic segmentation masks. To address the misalignment in poses and expressions, region-wise spatial transformations are inferred to align the reference and source masks, and are then applied to explicitly warp the reference feature maps to the source face, without any extra supervision. The resulting feature maps are fused with the source ones and inserted into a generator for image synthesis. In addition, the reference and source feature vectors are fused and used to determine the modulation parameters at multiple intermediate layers. SRSFT achieves superior beautification performance in terms of seamlessness and fidelity.
Citations: 0
Domain-invariant representation learning via SAM for blood cell classification
IF 7.5, Zone 1, Computer Science
Pattern Recognition, Volume 170, Article 112000. Pub Date: 2025-06-25. DOI: 10.1016/j.patcog.2025.112000
Yongcheng Li, Lingcong Cai, Ying Lu, Cheng Lin, Yupeng Zhang, Jingyan Jiang, Genan Dai, Bowen Zhang, Jingzhou Cao, Xiangzhong Zhang, Xiaomao Fan
Abstract: Accurate classification of blood cells is of vital significance in the diagnosis of hematological disorders, facilitating timely treatment for patients. However, in real-world scenarios, domain shifts caused by the variability in laboratory procedures and settings often result in rapid deterioration of model generalization performance. To address this issue, we propose a novel domain-invariant representation learning method via the Segment Anything Model (SAM) for blood cell classification, referred to as DoRL. DoRL comprises two main components: a LoRA-based SAM (LoRA-SAM) and a cross-domain autoencoder (CAE). The key advantage of DoRL is its ability to extract domain-invariant representations from various blood cell datasets in an unsupervised manner. Specifically, we first leverage the large-scale foundation model SAM, fine-tuned with LoRA, to generate robust and transferable visual representations of blood cells. Furthermore, we introduce the CAE to learn domain-invariant representations from the image embeddings across datasets from different domains. The CAE mitigates the impact of image artifacts and other domain-specific variations, making the learned representations more generalizable. To validate the effectiveness of the domain-invariant representations, we employ five widely used machine learning classifiers to construct blood cell classification models. Experimental results on two public blood cell datasets and a private real-world dataset demonstrate that the proposed DoRL achieves new state-of-the-art cross-domain performance, surpassing existing methods by a significant margin. DoRL, with its novel integration of LoRA-SAM and cross-domain autoencoding, provides a robust and effective solution for enhancing the generalization capabilities of blood cell classification models, potentially improving patient care and outcomes. The source code is available at https://github.com/AnoK3111/DoRL.
Citations: 0