Consistency-driven feature scoring and regularization network for visible–infrared person re-identification
Xueting Chen, Yan Yan, Jing-Hao Xue, Nannan Wang, Hanzi Wang
Pattern Recognition, Volume 159, Article 111131. Published 2024-11-02. DOI: 10.1016/j.patcog.2024.111131
Abstract: Recently, visible–infrared person re-identification (VI-ReID) has received considerable attention due to its practical importance. A number of methods extract multiple local features to enrich the diversity of feature representations. However, some local features often involve modality-relevant information, leading to deteriorated performance. Moreover, existing methods optimize the models by considering only the samples in each batch while ignoring the features learned at previous iterations. As a result, the features of the same person's images change drastically across training epochs, hindering training stability. To alleviate these issues, we propose a novel consistency-driven feature scoring and regularization network (CFSR-Net) for VI-ReID, which consists of a backbone network, a local feature learning block, a feature scoring block, and a global–local feature fusion block. On the one hand, we design a cross-modality consistency loss to highlight modality-irrelevant local features and suppress modality-relevant local features for each modality, facilitating the generation of a reliable, compact local feature. On the other hand, we develop a feature consistency regularization strategy (including a momentum class contrastive loss and a momentum distillation loss) that imposes consistency regularization on the learning of different levels of features by considering the features learned at historical epochs. This enables smooth feature changes and thus improves training stability. Extensive experiments on public VI-ReID datasets clearly show the effectiveness of our method against several state-of-the-art VI-ReID methods. Code will be released at https://github.com/cxtjl/CFSR-Net.
{"title":"ANNE: Adaptive Nearest Neighbours and Eigenvector-based sample selection for robust learning with noisy labels","authors":"Filipe R. Cordeiro , Gustavo Carneiro","doi":"10.1016/j.patcog.2024.111132","DOIUrl":"10.1016/j.patcog.2024.111132","url":null,"abstract":"<div><div>An important stage of most state-of-the-art (SOTA) noisy-label learning methods consists of a sample selection procedure that classifies samples from the noisy-label training set into noisy-label or clean-label subsets. The process of sample selection typically consists of one of the two approaches: loss-based sampling, where high-loss samples are considered to have noisy labels, or feature-based sampling, where samples from the same class tend to cluster together in the feature space and noisy-label samples are identified as anomalies within those clusters. Empirically, loss-based sampling is robust to a wide range of noise rates, while feature-based sampling tends to work effectively in particular scenarios, e.g., the filtering of noisy instances via their eigenvectors (FINE) sampling exhibits greater robustness in scenarios with low noise rates, and the K nearest neighbour (KNN) sampling mitigates better high noise-rate problems. This paper introduces the Adaptive Nearest Neighbours and Eigenvector-based (ANNE) sample selection methodology, a novel approach that integrates loss-based sampling with the feature-based sampling methods FINE and Adaptive KNN to optimize performance across a wide range of noise rate scenarios. ANNE achieves this integration by first partitioning the training set into high-loss and low-loss sub-groups using loss-based sampling. Subsequently, within the low-loss subset, sample selection is performed using FINE, while the high-loss subset employs Adaptive KNN for effective sample selection. We integrate ANNE into the noisy-label learning state of the art (SOTA) method SSR+, and test it on CIFAR-10/-100 (with symmetric, asymmetric and instance-dependent noise), Webvision and ANIMAL-10, where our method shows better accuracy than the SOTA in most experiments, with a competitive training time. The code is available at <span><span>https://github.com/filipe-research/anne</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"159 ","pages":"Article 111132"},"PeriodicalIF":7.5,"publicationDate":"2024-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142593964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Self-supervised random mask attention GAN in tackling pose-invariant face recognition","authors":"Jiashu Liao , Tanaya Guha , Victor Sanchez","doi":"10.1016/j.patcog.2024.111112","DOIUrl":"10.1016/j.patcog.2024.111112","url":null,"abstract":"<div><div>Pose Invariant Face Recognition (PIFR) has significantly advanced with Generative Adversarial Networks (GANs), which rotate face images acquired at any angle to a frontal view for enhanced recognition. However, such frontalization methods typically need ground-truth frontal-view images, often collected under strict laboratory conditions, making it challenging and costly to acquire the necessary training data. Additionally, traditional self-supervised PIFR methods rely on external rendering models for training, further complicating the overall training process. To tackle these two issues, we propose a new framework called <em>Mask Rotate</em>. Our framework introduces a novel training approach that requires no paired ground truth data for the face image frontalization task. Moreover, it eliminates the need for an external rendering model during training. Specifically, our framework simplifies the face image frontalization task by transforming it into a face image completion task. During the inference or testing stage, it employs a reliable pre-trained rendering model to obtain a frontal-view face image, which may have several regions with missing texture due to pose variations and occlusion. Our framework then uses a novel self-supervised <em>Random Mask</em> Attention Generative Adversarial Network (RMAGAN) to fill in these missing regions by considering them as randomly masked regions. Furthermore, our proposed <em>Mask Rotate</em> framework uses a reliable post-processing model designed to improve the visual quality of the face images after frontalization. In comprehensive experiments, the <em>Mask Rotate</em> framework eliminates the requirement for complex computations during training and achieves strong results, both qualitative and quantitative, compared to the state-of-the-art.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"159 ","pages":"Article 111112"},"PeriodicalIF":7.5,"publicationDate":"2024-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142652074","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SPColor: Semantic prior guided exemplar-based image colorization
Siqi Chen, Xianlin Zhang, Mingdao Wang, Xueming Li, Yu Zhang, Yue Zhang
Pattern Recognition, Volume 159, Article 111109. Published 2024-11-02. DOI: 10.1016/j.patcog.2024.111109
Abstract: Exemplar-based image colorization aims to colorize a target grayscale image based on a color reference image, and the key is to establish accurate pixel-level semantic correspondence between these two images. Previous methods directly search for correspondence over the entire reference image, and this type of global matching is prone to mismatch. Intuitively, a reasonable correspondence should be established between objects that are semantically similar. Motivated by this, we introduce the idea of a semantic prior and propose SPColor, a semantic prior guided exemplar-based image colorization framework. Several novel components are systematically designed in SPColor, including a semantic prior guided correspondence network (SPC), a category reduction algorithm (CRA), and a similarity masked perceptual loss (SMP loss). Different from previous methods, SPColor establishes correspondence between pixels in the same semantic class locally. In this way, improper correspondence between different semantic classes is explicitly excluded, and mismatch is markedly alleviated. In addition, SPColor supports region-level class assignments before SPC in the pipeline. With this feature, a category manipulation process (CMP) is proposed as an interactive interface to control colorization, which can also produce more varied colorization results and improve the flexibility of reference selection. Experiments demonstrate that our model outperforms recent state-of-the-art methods both quantitatively and qualitatively on public datasets. Our code is available at https://github.com/viector/spcolor.
Enhancing robust VQA via contrastive and self-supervised learning
Runlin Cao, Zhixin Li, Zhenjun Tang, Canlong Zhang, Huifang Ma
Pattern Recognition, Volume 159, Article 111129. Published 2024-11-02. DOI: 10.1016/j.patcog.2024.111129
Abstract: Visual Question Answering (VQA) aims to evaluate the reasoning abilities of an intelligent agent using visual and textual information. However, recent research indicates that many VQA models rely primarily on learning the correlation between questions and answers in the training dataset rather than demonstrating actual reasoning ability. To address this limitation, we propose a novel training approach called Enhancing Robust VQA via Contrastive and Self-supervised Learning (CSL-VQA) to construct a more robust VQA model. Our approach involves generating two types of negative samples to balance the biased data, using self-supervised auxiliary tasks to help the base VQA model overcome language priors, and filtering out biased training samples. In addition, we construct positive samples by removing spurious correlations in biased samples and perform auxiliary training through contrastive learning. Our approach does not require additional annotations and is compatible with different VQA backbones. Experimental results demonstrate that CSL-VQA significantly outperforms current state-of-the-art approaches, achieving an accuracy of 62.30% on the VQA-CP v2 dataset, while maintaining robust performance on the in-distribution VQA v2 dataset. Moreover, our method shows superior generalization capabilities on challenging datasets such as GQA-OOD and VQA-CE, proving its effectiveness in reducing language bias and enhancing the overall robustness of VQA models.
{"title":"TransMatch: Transformer-based correspondence pruning via local and global consensus","authors":"Yizhang Liu , Yanping Li , Shengjie Zhao","doi":"10.1016/j.patcog.2024.111120","DOIUrl":"10.1016/j.patcog.2024.111120","url":null,"abstract":"<div><div>Correspondence pruning aims to filter out false correspondences (a.k.a. outliers) from the initial feature correspondence set, which is pivotal to matching-based vision tasks, such as image registration. To solve this problem, most existing learning-based methods typically use a multilayer perceptron framework and several well-designed modules to capture local and global contexts. However, few studies have explored how local and global consensuses interact to form cohesive feature representations. This paper proposes a novel framework called TransMatch, which leverages the full power of Transformer structure to extract richer features and facilitate progressive local and global consensus learning. In addition to enhancing feature learning, Transformer is used as a powerful tool to connect the above two consensuses. Benefiting from Transformer, our TransMatch is surprisingly effective for differentiating correspondences. Experimental results on correspondence pruning and camera pose estimation demonstrate that the proposed TransMatch outperforms other state-of-the-art methods by a large margin. The code will be available at <span><span>https://github.com/lyz8023lyp/TransMatch/</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"159 ","pages":"Article 111120"},"PeriodicalIF":7.5,"publicationDate":"2024-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142652183","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"L2T-DFM: Learning to Teach with Dynamic Fused Metric","authors":"Zhaoyang Hai, Liyuan Pan, Xiabi Liu, Mengqiao Han","doi":"10.1016/j.patcog.2024.111124","DOIUrl":"10.1016/j.patcog.2024.111124","url":null,"abstract":"<div><div>The loss function plays a crucial role in the construction of machine learning algorithms. Employing a teacher model to set loss functions dynamically for student models has attracted attention. In existing works, (1) the characterization of the dynamic loss suffers from some inherent limitations, <em>ie</em>, the computational cost of loss networks and the restricted similarity measurement handcrafted loss functions; and (2) the states of the student model are provided to the teacher model directly without integration, causing the teacher model to underperform when trained on insufficient amounts of data. To alleviate the above-mentioned issues, in this paper, we select and weigh a set of similarity metrics by a confidence-based selection algorithm and a temporal teacher model to enhance the dynamic loss functions. Subsequently, to integrate the states of the student model, we employ statistics to quantify the information loss of the student model. Extensive experiments demonstrate that our approach can enhance student learning and improve the performance of various deep models on real-world tasks, including classification, object detection, and semantic segmentation scenarios.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"159 ","pages":"Article 111124"},"PeriodicalIF":7.5,"publicationDate":"2024-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142593956","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Self-distillation with beta label smoothing-based cross-subject transfer learning for P300 classification
Shurui Li, Liming Zhao, Chang Liu, Jing Jin, Cuntai Guan
Pattern Recognition, Volume 159, Article 111114. Published 2024-11-02. DOI: 10.1016/j.patcog.2024.111114
Abstract:
Background: The P300 speller is one of the most well-known brain-computer interface (BCI) systems, offering users a novel way to communicate with their environment by decoding brain activity.
Problem: However, most P300-based BCI systems require a long calibration phase to develop a subject-specific model, which can be inconvenient and time-consuming. Additionally, cross-subject P300 classification is challenging due to significant inter-individual variations.
Method: To address these issues, this study proposes a calibration-free approach for P300 signal detection. Specifically, we incorporate self-distillation along with a beta label smoothing method to enhance model generalization and overall system performance, which not only enables the distillation of informative knowledge from the electroencephalogram (EEG) data of other subjects but also effectively reduces individual variability.
Experimental results: The results on the publicly available OpenBMI dataset demonstrate that the proposed method achieves statistically significantly higher performance compared to state-of-the-art approaches. Notably, the average character recognition accuracy of our method reaches up to 97.37% without the need for calibration. The information transfer rate and visualization results further confirm its effectiveness.
Significance: This method holds great promise for future developments in BCI applications.
{"title":"Text–video retrieval re-ranking via multi-grained cross attention and frozen image encoders","authors":"Zuozhuo Dai , Kaihui Cheng , Fangtao Shao , Zilong Dong , Siyu Zhu","doi":"10.1016/j.patcog.2024.111099","DOIUrl":"10.1016/j.patcog.2024.111099","url":null,"abstract":"<div><div>State-of-the-art methods for text–video retrieval generally leverage CLIP embeddings and cosine similarity for efficient retrieval. Meanwhile, recent advancements in cross-attention techniques introduce transformer decoders to facilitate attention computation between text queries and visual tokens extracted from video frames, enabling a more comprehensive interaction between textual and visual information. In this study, we combine the advantages of both approaches and propose a fine-grained re-ranking approach incorporating a multi-grained text–video cross attention module. Specifically, the re-ranker enhances the top K similar candidates identified by the cosine similarity network. To explore video and text interactions efficiently, we introduce frame and video token selectors to obtain salient visual tokens at both frame and video levels. Then, a multi-grained cross-attention mechanism is applied between text and visual tokens at these levels to capture multimodal information. To reduce the training overhead associated with the multi-grained cross-attention module, we freeze the vision backbone and only train the multi-grained cross attention module. This frozen strategy allows for scalability to larger pre-trained vision models such as ViT-G, leading to enhanced retrieval performance. Experimental evaluations on text–video retrieval datasets showcase the effectiveness and scalability of our proposed re-ranker combined with existing state-of-the-art methodologies.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"159 ","pages":"Article 111099"},"PeriodicalIF":7.5,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142652182","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Accelerating the convergence of concept drift based on knowledge transfer","authors":"Husheng Guo , Zhijie Wu , Qiaoyan Ren , Wenjian Wang","doi":"10.1016/j.patcog.2024.111145","DOIUrl":"10.1016/j.patcog.2024.111145","url":null,"abstract":"<div><div>Concept drift detection and processing is an important issue in streaming data mining. When concept drift occurs, online learning model often cannot quickly adapt to the new data distribution due to the insufficient newly distributed data, which may lead to poor model performance. Currently, most online learning methods adapt to new data distributions after concept drift through autonomous adjustment of the model, but they may often fail to update the model to a stable state quickly. To solve these problems, this paper proposes an accelerating convergence method of concept drift based on knowledge transfer (<span><math><mrow><mi>ACC</mi><mtext>_</mtext><mi>KT</mi></mrow></math></span>). It extracts the most valuable information from the source domain (pre-drift data), and transfers it to the target domain (post-drift data), to realize the update of the ensemble model by knowledge transfer. Besides, different knowledge transfer patterns are adopted to accelerate convergence of model performance when different types concept drift occur. Experimental results show that the proposed method has an obvious acceleration effect on the online learning model after concept drift.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"159 ","pages":"Article 111145"},"PeriodicalIF":7.5,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142652040","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}