Pattern Recognition — Latest Articles

Image shadow removal via multi-scale deep Retinex decomposition
IF 7.5 · CAS Tier 1 · Computer Science
Pattern Recognition. Pub Date: 2024-11-02. DOI: 10.1016/j.patcog.2024.111126
Abstract: In recent years, deep learning has emerged as an important tool for image shadow removal. However, existing methods often prioritize shadow detection and, in doing so, oversimplify the lighting conditions of shadow regions. Furthermore, these methods neglect cues from the overall image lighting when re-lighting shadow areas, thereby failing to ensure global lighting consistency. To address these challenges in images captured under complex lighting conditions, this paper introduces a multi-scale network built on a Retinex decomposition model. The proposed approach effectively senses shadows with uneven lighting and re-lights them, achieving greater consistency along shadow boundaries. For the network design, we introduce several techniques for boosting shadow removal performance, including a shadow-aware channel attention module, local discriminative and Retinex decomposition loss functions, and a multi-scale mechanism that guides Retinex decomposition by concurrently capturing both fine-grained details and large-scale contextual information. Experimental results demonstrate the superiority of the proposed method over existing solutions, particularly for images taken under complex lighting conditions.
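The Retinex model underlying this line of work factors an image into reflectance and illumination, I = R · L. A minimal single-scale sketch in NumPy, with a multi-scale average over several surround sizes — the Gaussian surround, the scales, and the function names are illustrative assumptions, not the paper's network:

```python
import numpy as np

def gaussian_kernel(sigma):
    """1-D Gaussian kernel truncated at 3 sigma, normalised to sum 1."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    return k / k.sum()

def retinex_decompose(image, sigma=2.0, eps=1e-6):
    """Single-scale Retinex decomposition I = R * L in the log domain:
    illumination L is estimated by Gaussian smoothing of log(I), and
    reflectance is the residual log(I) - log(L)."""
    log_i = np.log(image + eps)
    k = gaussian_kernel(sigma)
    # separable Gaussian blur: filter rows, then columns
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, log_i)
    log_l = np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, rows)
    return np.exp(log_i - log_l), np.exp(log_l)

def multiscale_reflectance(image, sigmas=(1.0, 2.0, 4.0)):
    """Average reflectance over several surround scales: small sigmas
    retain fine detail, large ones capture broader context."""
    return np.mean([retinex_decompose(image, s)[0] for s in sigmas], axis=0)
```

By construction the two factors multiply back to the (epsilon-shifted) input, which is what makes the log-domain residual a valid decomposition.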
Citations: 0

ANNE: Adaptive Nearest Neighbours and Eigenvector-based sample selection for robust learning with noisy labels
IF 7.5 · CAS Tier 1 · Computer Science
Pattern Recognition. Pub Date: 2024-11-02. DOI: 10.1016/j.patcog.2024.111132
Abstract: An important stage of most state-of-the-art (SOTA) noisy-label learning methods is a sample selection procedure that classifies samples from the noisy-label training set into noisy-label or clean-label subsets. Sample selection typically follows one of two approaches: loss-based sampling, where high-loss samples are considered to have noisy labels, or feature-based sampling, where samples from the same class tend to cluster together in the feature space and noisy-label samples are identified as anomalies within those clusters. Empirically, loss-based sampling is robust to a wide range of noise rates, while feature-based sampling tends to work effectively only in particular scenarios: filtering noisy instances via their eigenvectors (FINE) exhibits greater robustness at low noise rates, while K-nearest-neighbour (KNN) sampling better mitigates high-noise-rate problems. This paper introduces the Adaptive Nearest Neighbours and Eigenvector-based (ANNE) sample selection methodology, a novel approach that integrates loss-based sampling with the feature-based sampling methods FINE and Adaptive KNN to optimize performance across a wide range of noise-rate scenarios. ANNE first partitions the training set into high-loss and low-loss sub-groups using loss-based sampling; sample selection is then performed with FINE on the low-loss subset and with Adaptive KNN on the high-loss subset. We integrate ANNE into the SOTA noisy-label learning method SSR+ and test it on CIFAR-10/-100 (with symmetric, asymmetric, and instance-dependent noise), Webvision, and ANIMAL-10, where our method shows better accuracy than the SOTA in most experiments, with a competitive training time. The code is available at https://github.com/filipe-research/anne.
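The two-stage partition described above can be sketched in a few lines of NumPy: split by a loss threshold, score the low-loss group by alignment with the class principal eigenvector (FINE-like), and score the high-loss group by kNN density. The threshold, scoring rules, and function name are simplifications for illustration, not the paper's exact procedure:

```python
import numpy as np

def anne_style_selection(features, losses, k=5, loss_quantile=0.5):
    """Two-stage sample scoring: loss-based partition, then an
    eigenvector score for low-loss samples and a kNN-distance score
    for high-loss samples (both groups assumed non-empty).
    Higher score = more likely clean."""
    n = len(losses)
    low = losses <= np.quantile(losses, loss_quantile)
    scores = np.empty(n)

    # low-loss group: projection onto the principal eigenvector (FINE-like)
    f_low = features[low]
    f_centered = f_low - f_low.mean(axis=0)
    _, _, vt = np.linalg.svd(f_centered, full_matrices=False)
    scores[low] = np.abs(f_centered @ vt[0])       # large = typical of the group

    # high-loss group: negative mean distance to the k nearest neighbours
    f_high = features[~low]
    d = np.linalg.norm(f_high[:, None] - f_high[None, :], axis=-1)
    d.sort(axis=1)                                  # column 0 is the self-distance
    scores[~low] = -d[:, 1:k + 1].mean(axis=1)      # large = dense region
    return scores, low
```

In practice the paper makes the KNN stage adaptive and applies the selection per class; this sketch collapses both into a single unlabeled scoring pass.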
Citations: 0

Consistency-driven feature scoring and regularization network for visible–infrared person re-identification
IF 7.5 · CAS Tier 1 · Computer Science
Pattern Recognition. Pub Date: 2024-11-02. DOI: 10.1016/j.patcog.2024.111131
Abstract: Recently, visible–infrared person re-identification (VI-ReID) has received considerable attention due to its practical importance. A number of methods extract multiple local features to enrich the diversity of feature representations. However, some local features often involve modality-relevant information, leading to deteriorated performance. Moreover, existing methods optimize the models by considering only the samples in each batch while ignoring the features learned at previous iterations. As a result, the features of the same person's images change drastically across training epochs, hindering training stability. To alleviate these issues, we propose a novel consistency-driven feature scoring and regularization network (CFSR-Net) for VI-ReID, consisting of a backbone network, a local feature learning block, a feature scoring block, and a global–local feature fusion block. On the one hand, we design a cross-modality consistency loss that highlights modality-irrelevant local features and suppresses modality-relevant local features for each modality, facilitating the generation of a reliable, compact local feature. On the other hand, we develop a feature consistency regularization strategy (including a momentum class contrastive loss and a momentum distillation loss) that imposes consistency regularization on the learning of different levels of features by considering the features learned at historical epochs. This enables smooth feature changes and thus improves training stability. Extensive experiments on public VI-ReID datasets clearly show the effectiveness of our method against several state-of-the-art VI-ReID methods. Code will be released at https://github.com/cxtjl/CFSR-Net.
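The momentum idea behind the consistency regularization — blending historical features into a slowly moving per-class target so representations cannot jump between epochs — can be sketched as a prototype update. The momentum value and normalisation are illustrative assumptions, not the paper's exact losses:

```python
import numpy as np

def momentum_prototype_update(prototypes, feats, labels, m=0.9):
    """Blend each class's current batch-mean feature into a running
    (momentum) prototype, then re-normalise. A high momentum m keeps
    the target stable across epochs, which is the smoothing effect the
    consistency regularization relies on."""
    for c in np.unique(labels):
        mean_c = feats[labels == c].mean(axis=0)
        prototypes[c] = m * prototypes[c] + (1 - m) * mean_c
        prototypes[c] /= np.linalg.norm(prototypes[c]) + 1e-12
    return prototypes
```

A momentum class contrastive loss would then pull each feature toward its own class prototype and away from the others; the prototype table stands in for the "features learned at historical epochs."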
Citations: 0

Enhancing robust VQA via contrastive and self-supervised learning
IF 7.5 · CAS Tier 1 · Computer Science
Pattern Recognition. Pub Date: 2024-11-02. DOI: 10.1016/j.patcog.2024.111129
Abstract: Visual Question Answering (VQA) aims to evaluate the reasoning abilities of an intelligent agent using visual and textual information. However, recent research indicates that many VQA models rely primarily on learning the correlation between questions and answers in the training dataset rather than demonstrating actual reasoning ability. To address this limitation, we propose a novel training approach called Enhancing Robust VQA via Contrastive and Self-supervised Learning (CSL-VQA) to construct a more robust VQA model. Our approach generates two types of negative samples to balance the biased data, uses self-supervised auxiliary tasks to help the base VQA model overcome language priors, and filters out biased training samples. In addition, we construct positive samples by removing spurious correlations in biased samples and perform auxiliary training through contrastive learning. Our approach requires no additional annotations and is compatible with different VQA backbones. Experimental results demonstrate that CSL-VQA significantly outperforms current state-of-the-art approaches, achieving an accuracy of 62.30% on the VQA-CP v2 dataset while maintaining robust performance on the in-distribution VQA v2 dataset. Moreover, our method shows superior generalization on challenging datasets such as GQA-OOD and VQA-CE, proving its effectiveness in reducing language bias and enhancing the overall robustness of VQA models.
Citations: 0

L2T-DFM: Learning to Teach with Dynamic Fused Metric
IF 7.5 · CAS Tier 1 · Computer Science
Pattern Recognition. Pub Date: 2024-11-02. DOI: 10.1016/j.patcog.2024.111124
Abstract: The loss function plays a crucial role in the construction of machine learning algorithms. Employing a teacher model to set loss functions dynamically for student models has attracted attention. In existing works, (1) the characterization of the dynamic loss suffers from inherent limitations, i.e., the computational cost of loss networks and the restricted similarity measurement of handcrafted loss functions; and (2) the states of the student model are provided to the teacher model directly, without integration, causing the teacher model to underperform when trained on insufficient amounts of data. To alleviate these issues, we select and weight a set of similarity metrics via a confidence-based selection algorithm and a temporal teacher model to enhance the dynamic loss functions. To integrate the states of the student model, we employ statistics to quantify the information loss of the student model. Extensive experiments demonstrate that our approach can enhance student learning and improve the performance of various deep models on real-world tasks, including classification, object detection, and semantic segmentation.
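The core of a fused metric is a weighted combination of several similarity measures into one scalar loss, with the weights supplied dynamically by the teacher. A minimal sketch under that reading — the three component metrics, the softmax normalisation, and the function name are assumptions for illustration, not the paper's construction:

```python
import numpy as np

def fused_metric(pred, target, weights):
    """Combine L1, L2, and cosine dissimilarity into one loss using
    softmax-normalised fusion weights. In the learning-to-teach setting
    the weights would be produced by the teacher model; here they are
    plain parameters."""
    w = np.exp(weights - weights.max())
    w /= w.sum()                                   # softmax over metric weights
    l1 = np.abs(pred - target).mean()
    l2 = ((pred - target) ** 2).mean()
    cos = 1 - (pred @ target) / (np.linalg.norm(pred) * np.linalg.norm(target) + 1e-8)
    return w[0] * l1 + w[1] * l2 + w[2] * cos
```

Because every component vanishes when prediction and target coincide, the fused loss does too, regardless of the teacher's weighting.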
Citations: 0

Self-distillation with beta label smoothing-based cross-subject transfer learning for P300 classification
IF 7.5 · CAS Tier 1 · Computer Science
Pattern Recognition. Pub Date: 2024-11-02. DOI: 10.1016/j.patcog.2024.111114
Abstract — Background: The P300 speller is one of the best-known brain–computer interface (BCI) systems, offering users a novel way to communicate with their environment by decoding brain activity. Problem: However, most P300-based BCI systems require a long calibration phase to develop a subject-specific model, which is inconvenient and time-consuming. Additionally, cross-subject P300 classification is challenging due to significant inter-individual variation. Method: To address these issues, this study proposes a calibration-free approach to P300 signal detection. Specifically, we incorporate self-distillation along with a beta label smoothing method to enhance model generalization and overall system performance, which not only distills informative knowledge from the electroencephalogram (EEG) data of other subjects but also effectively reduces individual variability. Experimental results: Results on the publicly available OpenBMI dataset demonstrate that the proposed method achieves statistically significantly higher performance than state-of-the-art approaches. Notably, the average character recognition accuracy of our method reaches up to 97.37% without any calibration, and information transfer rate and visualization analyses further confirm its effectiveness. Significance: This method holds great promise for future developments in BCI applications.
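One plausible reading of "beta label smoothing" is ordinary label smoothing whose strength is drawn per sample from a Beta distribution rather than fixed. A minimal sketch under that assumption — the Beta parameters and function name are illustrative, not taken from the paper:

```python
import numpy as np

def beta_label_smoothing(labels, n_classes, a=2.0, b=5.0, rng=None):
    """Smooth one-hot targets toward the uniform distribution, with the
    per-sample smoothing strength eps ~ Beta(a, b). Because eps < 1,
    the true class always keeps the largest probability mass."""
    rng = rng or np.random.default_rng()
    eps = rng.beta(a, b, size=len(labels))          # per-sample smoothing factor
    onehot = np.eye(n_classes)[labels]
    uniform = np.full((len(labels), n_classes), 1.0 / n_classes)
    return (1 - eps)[:, None] * onehot + eps[:, None] * uniform
```

In a self-distillation pipeline these softened targets would replace hard labels when training on other subjects' EEG data, reducing overconfidence on subject-specific patterns.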
Citations: 0

A unified framework for unsupervised action learning via global-to-local motion transformer
IF 7.5 · CAS Tier 1 · Computer Science
Pattern Recognition. Pub Date: 2024-11-01. DOI: 10.1016/j.patcog.2024.111118
Abstract: Human action recognition remains challenging due to the inherent complexity arising from diverse granularities of semantics, ranging from the local motion of body joints to high-level relationships across multiple people. To learn this multi-level characteristic of human action in an unsupervised manner, we propose a novel pretraining strategy along with a transformer-based model architecture named GL-Transformer++. Prior methods in unsupervised action recognition and unsupervised group activity recognition (GAR) have shown limitations, often capturing only a partial scope of the action, such as the local movements of each individual or the broader context of the overall motion. To tackle this problem, we introduce a novel pretraining strategy named multi-interval pose displacement prediction (MPDP) that enables the model to learn the diverse extents of the action. Architecturally, we incorporate a global and local attention (GLA) mechanism within the transformer blocks to learn local dynamics between joints, the global context of each individual, and high-level interpersonal relationships in both the spatial and temporal dimensions. The proposed method is a unified approach that demonstrates efficacy in both action recognition and GAR. In particular, it establishes a new and strong baseline, surpassing the current SOTA GAR method by significant margins: 29.6% on Volleyball, and 60.3% and 59.9% on the xsub and xset settings of the Mutual NTU dataset, respectively.
Citations: 0

GBMOD: A granular-ball mean-shift outlier detector
IF 7.5 · CAS Tier 1 · Computer Science
Pattern Recognition. Pub Date: 2024-11-01. DOI: 10.1016/j.patcog.2024.111115
Abstract: Outlier detection is a crucial data mining task involving the identification of abnormal objects, errors, or emerging trends. Mean-shift-based outlier detection techniques evaluate the abnormality of an object by calculating the mean distance between the object and its k-nearest neighbors. However, in datasets with significant noise, the presence of noise among the k-nearest neighbors of some objects makes such models ineffective at detecting outliers. Additionally, mean-shift outlier detection depends on finding the k-nearest neighbors of each object, which can be time-consuming. To address these issues, we propose a granular-ball computing-based mean-shift outlier detection method (GBMOD). Specifically, we first generate high-quality granular-balls to cover the data. By using the centers of granular-balls as anchors, the subsequent mean-shift process effectively avoids the influence of noise points in the neighborhood. Each object is then assigned an outlier score based on the distance between the object and the shifted center of the granular-ball to which it belongs. Subsequent experiments demonstrate the effectiveness, efficiency, and robustness of the proposed method.
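The pipeline above — cover the data with balls, mean-shift only the ball centers, score points by distance to their ball's shifted center — can be sketched compactly. The ball construction here is a crude k-means grouping and the kernel bandwidth is a free parameter; both are stand-ins for the paper's granular-ball generation, chosen only to make the anchor-based mean-shift step concrete:

```python
import numpy as np

def gb_mean_shift_scores(X, n_balls=5, iters=10, bandwidth=3.0):
    """Outlier scores via ball-center anchors: (1) form balls with a
    tiny k-means, (2) apply one Gaussian-kernel mean-shift step to each
    ball center so it moves toward dense regions, (3) score each point
    by its distance to the shifted center of its ball.
    Larger score = more outlying."""
    rng = np.random.default_rng(0)
    centers = X[rng.choice(len(X), n_balls, replace=False)].copy()
    for _ in range(iters):                          # crude k-means to form balls
        d = np.linalg.norm(X[:, None] - centers[None], axis=-1)
        assign = d.argmin(axis=1)
        for j in range(n_balls):
            if (assign == j).any():
                centers[j] = X[assign == j].mean(axis=0)
    # one mean-shift step per center over ALL points (centers act as anchors)
    shifted = centers.copy()
    for j in range(n_balls):
        w = np.exp(-np.linalg.norm(X - centers[j], axis=1) ** 2 / (2 * bandwidth ** 2))
        shifted[j] = (w[:, None] * X).sum(axis=0) / w.sum()
    d_shift = np.linalg.norm(X[:, None] - shifted[None], axis=-1)
    return d_shift[np.arange(len(X)), assign]
```

Because only the handful of ball centers is shifted, the expensive per-object k-nearest-neighbor search of classic mean-shift detection is avoided, which is the efficiency argument the abstract makes.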
Citations: 0

FedKT: Federated learning with knowledge transfer for non-IID data
IF 7.5 · CAS Tier 1 · Computer Science
Pattern Recognition. Pub Date: 2024-11-01. DOI: 10.1016/j.patcog.2024.111143
Abstract: Federated learning enables clients to train a joint model collaboratively without disclosing raw data. However, learning over non-IID data can cause performance degradation, which has become a fundamental bottleneck. Despite numerous efforts to address this issue, challenges such as excessive local computational burdens and reliance on shared data persist, rendering existing solutions impractical in real-world scenarios. In this paper, we propose a novel federated knowledge transfer framework to overcome data heterogeneity. Specifically, a model segmentation distillation method and a learnable aggregation network are developed for server-side knowledge ensemble and transfer, while a client-side consistency-constrained loss is devised to rectify local updates, thereby enhancing both the global and client models. The framework considers both diversity and consistency among clients and can serve as a general solution for extracting knowledge from distributed nodes. Extensive experiments on four datasets demonstrate the framework's effectiveness, achieving superior performance compared to advanced competitors in high-heterogeneity settings.
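The server-side "knowledge ensemble and transfer" step amounts to fusing client outputs into a teacher distribution and distilling it into the global model. A minimal sketch where a weighted average of client logits stands in for the paper's learnable aggregation network (the weighting, temperature, and names are assumptions):

```python
import numpy as np

def softmax(z, t=1.0):
    """Temperature-scaled, numerically stable softmax over the last axis."""
    z = z / t
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ensemble_distill_loss(client_logits, student_logits, weights, t=2.0):
    """Fuse per-client logits (shape: clients x samples x classes) with
    softmax-normalised aggregation weights into a teacher distribution,
    then measure the KL divergence from the student to that teacher."""
    w = softmax(weights)                                # normalise aggregation weights
    teacher = softmax(np.tensordot(w, client_logits, axes=1), t)
    student = softmax(student_logits, t)
    return np.sum(teacher * (np.log(teacher + 1e-12) - np.log(student + 1e-12)))
```

The loss is zero exactly when the student reproduces the fused teacher, so minimising it transfers the ensembled client knowledge into the global model without touching raw data.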
Citations: 0

Data augmentation strategies for semi-supervised medical image segmentation
IF 7.5 · CAS Tier 1 · Computer Science
Pattern Recognition. Pub Date: 2024-11-01. DOI: 10.1016/j.patcog.2024.111116
Abstract: Exploiting unlabeled and labeled data augmentation has become considerably important for semi-supervised medical image segmentation. However, existing data augmentation methods, such as Cut-mix and generative models, typically depend on consistency regularization or ignore the data correlation between slices. To address these problems, we propose two novel data augmentation strategies and a Dual Attention-guided Consistency network (DACNet) that significantly improve semi-supervised medical image segmentation performance. For labeled data augmentation, we randomly crop and stitch annotated data, rather than unlabeled data, to create mixed annotated data, which breaks the anatomical structures and introduces voxel-level uncertainty into the limited annotated data. For unlabeled data augmentation, we combine a diffusion model with a Laplacian-pyramid fusion strategy to generate unlabeled data with higher slice correlation. To encourage the decoders to learn different semantic but discriminative features, DACNet achieves structural differentiation by introducing spatial and channel attention into the decoders. Extensive experiments demonstrate the effectiveness and generalization of our approach. Specifically, our proposed labeled and unlabeled data augmentation strategies improve accuracy by 0.3%–16.49% and 0.22%–1.72%, respectively, compared with various state-of-the-art semi-supervised methods. Furthermore, DACNet outperforms existing methods on three medical datasets (91.72% Dice score with 20% labeled data on the LA dataset). Source code will be publicly available at https://github.com/Oubit1/DACNet.
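The labeled crop-and-stitch augmentation described above is simple to sketch: cut a random rectangle from one annotated image and paste it, together with the matching label region, into another. The patch-size range and function name are illustrative assumptions, not the paper's settings:

```python
import numpy as np

def crop_and_stitch(img_a, lab_a, img_b, lab_b, rng=None):
    """Cut a random rectangle from (img_b, lab_b) and stitch it into a
    copy of (img_a, lab_a). Image and label map are mixed identically,
    so the result stays a valid annotated sample while the anatomical
    structure is deliberately broken."""
    rng = rng or np.random.default_rng()
    h, w = img_a.shape[:2]
    # patch spans between a quarter and half of each dimension (illustrative)
    ch = rng.integers(h // 4, h // 2 + 1)
    cw = rng.integers(w // 4, w // 2 + 1)
    y = rng.integers(0, h - ch + 1)
    x = rng.integers(0, w - cw + 1)
    img, lab = img_a.copy(), lab_a.copy()
    img[y:y + ch, x:x + cw] = img_b[y:y + ch, x:x + cw]
    lab[y:y + ch, x:x + cw] = lab_b[y:y + ch, x:x + cw]
    return img, lab
```

Applying the identical region to both image and label is what distinguishes this from unlabeled Cut-mix: the stitched sample carries full voxel-level supervision despite the broken anatomy.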
Citations: 0