Pattern Recognition Letters — Latest Articles

Attribute disentanglement and re-entanglement for generalized zero-shot learning
IF 3.9 | CAS Tier 3, Computer Science
Pattern Recognition Letters | Pub Date: 2024-09-12 | DOI: 10.1016/j.patrec.2024.09.007
Abstract: The key challenge in zero-shot learning is inferring latent semantic knowledge between the visual and attribute features of seen classes so that it transfers to unseen classes. Because local attribute features can only ensure attribute-level recognition rather than classification of an entire class, some methods incorporate global information into the process or results of local feature extraction; however, these approaches have not fully resolved the issue. We therefore propose an attribute disentanglement and re-entanglement model for generalized zero-shot learning. Rather than implicitly or explicitly folding global information into local attribute features for classification, our model adjusts local attribute features to make them more suitable for classification in the re-entanglement phase, while ensuring their correct extraction in the disentanglement phase. With appropriate optimization loss functions, the model achieves significant improvements on three challenging benchmark datasets and is strongly competitive with similar methods.
Citations: 0

DeepSet SimCLR: Self-supervised deep sets for improved pathology representation learning
IF 3.9 | CAS Tier 3, Computer Science
Pattern Recognition Letters | Pub Date: 2024-09-12 | DOI: 10.1016/j.patrec.2024.09.005
Abstract: Applications of self-supervised learning (SSL) to 3D medical data often adopt 3D variants of successful 2D network architectures. Although promising, these approaches are significantly more computationally demanding to train, putting them out of reach for groups with modest computational resources. In this paper, we instead improve standard 2D SSL algorithms by implicitly modelling the inherent 3D nature of these datasets. We propose two variants that build upon a strong baseline model and show that both often outperform the baseline on a variety of downstream tasks. Importantly, in contrast to previous 2D and 3D approaches for 3D medical data, both of our proposals introduce negligible overhead in parameter complexity. Although data-loading overhead increases over the baseline SimCLR model (which we show can be partly mitigated through parallelisation), our models remain significantly more efficient than previous approaches based on sequence modelling. Overall, our methods help democratise these approaches for medical applications.
Citations: 0

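The abstract does not spell out the architecture, but the core idea, treating a 3D volume as a set of 2D slice embeddings pooled with a permutation-invariant (Deep Sets) aggregation, can be sketched as follows. The encoder, embedding size, and pooling choice below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def embed_slices(volume, rng):
    """Stand-in 2D encoder: maps each slice (H, W) to a feature vector.
    In practice this would be a CNN backbone; here, a fixed random projection."""
    n_slices, h, w = volume.shape
    proj = rng.standard_normal((h * w, 16))  # hypothetical 16-dim embedding
    return volume.reshape(n_slices, -1) @ proj

def deep_set_pool(slice_feats):
    """Permutation-invariant Deep Sets aggregation: sum over the slice axis.
    Any slice ordering yields the same volume-level representation."""
    return slice_feats.sum(axis=0)

rng = np.random.default_rng(0)
vol = rng.standard_normal((8, 4, 4))          # toy volume: 8 slices of 4x4
feats = embed_slices(vol, rng)
pooled = deep_set_pool(feats)
shuffled = deep_set_pool(feats[rng.permutation(8)])
print(np.allclose(pooled, shuffled))          # permutation invariance -> True
```

A contrastive (SimCLR-style) loss would then compare the pooled representations of two augmented views of the same volume, with the 2D encoder doing almost all of the parameter work.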
Use estimated signal and noise to adjust step size for image restoration
IF 3.9 | CAS Tier 3, Computer Science
Pattern Recognition Letters | Pub Date: 2024-09-11 | DOI: 10.1016/j.patrec.2024.09.006
Abstract: Image deblurring is a challenging inverse problem, especially when the observation contains additive noise. When solving such a problem iteratively, controlling the step size is important for stable and robust performance. We design a method that controls the progress of the iterative process without requiring a user-specified step size: it searches for an optimal step size under the assumption that the signal and noise are two independent stochastic processes. Experiments show good performance in the presence of noise and imperfect knowledge of the blurring kernel. Tests also show that, across blurring kernels and noise levels, the difference between two consecutive estimates given by the new method stays more stable and within a smaller range than those given by some existing techniques. This stability makes the method more robust, in the sense that a stopping threshold is easier to select across different scenarios.
Citations: 0

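The abstract does not specify the signal/noise step-size estimator itself, so as a point of reference, here is a minimal sketch of the generic baseline it improves upon: steepest-descent deblurring in which the step size is recomputed every iteration by an exact line search on the data-fidelity term. The kernel, test signal, and iteration count are illustrative assumptions:

```python
import numpy as np

def blur(x, h):
    """Circular convolution of signal x with kernel h (forward operator H)."""
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h, len(x))))

def blur_adjoint(x, h):
    """Adjoint H^T of circular convolution (conjugate kernel in frequency)."""
    return np.real(np.fft.ifft(np.conj(np.fft.fft(h, len(x))) * np.fft.fft(x)))

def deblur(y, h, n_iter=50):
    """Steepest descent on 0.5*||Hx - y||^2 with the exact line-search step
    alpha = ||g||^2 / ||Hg||^2, recomputed each iteration. This is a
    standard adaptive step size, not the paper's signal/noise estimator."""
    x = y.copy()
    for _ in range(n_iter):
        r = blur(x, h) - y             # residual Hx - y
        g = blur_adjoint(r, h)         # gradient H^T (Hx - y)
        Hg = blur(g, h)
        alpha = (g @ g) / (Hg @ Hg + 1e-12)
        x = x - alpha * g
    return x

rng = np.random.default_rng(1)
x_true = np.sin(np.linspace(0, 4 * np.pi, 33))
h = np.array([0.25, 0.5, 0.25])                    # mild blur kernel
y = blur(x_true, h) + 0.01 * rng.standard_normal(33)
x_hat = deblur(y, h)
```

With exact line search, the data-fidelity objective decreases monotonically, so the remaining design question (the one the paper addresses) is when to stop before noise amplification dominates.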
One-index vector quantization based adversarial attack on image classification
IF 3.9 | CAS Tier 3, Computer Science
Pattern Recognition Letters | Pub Date: 2024-09-06 | DOI: 10.1016/j.patrec.2024.09.001
Abstract: Images are generally compressed to improve storage and transmission. Vector quantization (VQ) is a popular compression method, with a high compression ratio that surpasses other compression techniques. Existing adversarial attacks on image classification, however, are mostly performed in the pixel domain, with few exceptions in the compressed domain, making them less applicable in real-world scenarios. In this paper, we propose a novel one-index attack method in the VQ domain that generates adversarial images with a differential evolution algorithm, successfully causing misclassification in victim models. The method modifies a single index in the compressed data stream so that the decompressed image is misclassified; needing only one modified VQ index limits the number of perturbed indexes. The attack is semi-black-box, which is more in line with realistic attack scenarios. We apply the method to three popular image classification models: ResNet, NIN, and VGG16. On average, 55.9% of CIFAR-10 images and 77.4% of Fashion-MNIST images are successfully attacked, with a high level of misclassification confidence and a low level of image perturbation.
Citations: 0

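As a toy illustration of the idea (flipping a classifier's decision by changing a single index in a VQ stream), the following sketch uses a random codebook and a stand-in linear classifier, and brute-forces the one-index search that the paper performs with differential evolution. Every component here is a hypothetical stand-in, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy VQ setup: a 16-entry codebook of 4-dim vectors; an "image" is a
# stream of 8 codebook indices that decompresses to a 32-dim vector.
codebook = rng.standard_normal((16, 4))
indices = rng.integers(0, 16, size=8)

def decompress(idx):
    """Decode a VQ index stream back to the flat signal."""
    return codebook[idx].reshape(-1)

# Stand-in victim classifier: a fixed linear model with a binary output.
w = rng.standard_normal(32)
predict = lambda x: int(x @ w > 0)

orig_label = predict(decompress(indices))

# One-index attack: search for a single (position, value) change that
# flips the prediction. Brute force suffices at this scale; differential
# evolution makes the search tractable on real index streams.
attack = None
for pos in range(len(indices)):
    for val in range(len(codebook)):
        trial = indices.copy()
        trial[pos] = val
        if predict(decompress(trial)) != orig_label:
            attack = (pos, val)
            break
    if attack is not None:
        break
print("adversarial index change:", attack)
```

Note that a flip is not guaranteed to exist for every stream; the paper's reported success rates (55.9% and 77.4%) reflect exactly this per-image feasibility.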
CAST: Clustering self-Attention using Surrogate Tokens for efficient transformers
IF 3.9 | CAS Tier 3, Computer Science
Pattern Recognition Letters | Pub Date: 2024-09-06 | DOI: 10.1016/j.patrec.2024.08.024
Abstract: The Transformer architecture has proven to be a powerful tool for a wide range of tasks. It is based on the self-attention mechanism, an inherently expensive operation with quadratic computational complexity: memory usage and compute time grow quadratically with the length of the input sequence, limiting the application of Transformers. In this work, we propose a novel Clustering self-Attention mechanism using Surrogate Tokens (CAST) to optimize the attention computation and achieve efficient transformers. CAST uses learnable surrogate tokens to construct a cluster affinity matrix, which clusters the input sequence and generates cluster summaries. The self-attention within each cluster is then combined with the cluster summaries of the other clusters, enabling information flow across the entire input sequence. CAST reduces the complexity from O(N^2) to O(αN), where N is the sequence length and α is a constant determined by the number of clusters and samples per cluster. We show that CAST performs better than or comparably to baseline Transformers on long-range sequence modeling tasks, while achieving better time and memory efficiency than other efficient transformers.
Citations: 0

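The clustering step can be sketched roughly as follows: surrogate tokens produce a token-to-cluster affinity, each token is routed to its best cluster, and full attention is computed only within clusters, so the cost is the sum of squared cluster sizes rather than N^2. The shapes, routing rule, and summary computation below are assumptions based on the abstract, not the paper's exact formulation, and the cross-cluster exchange of summaries is omitted:

```python
import numpy as np

def softmax(a, axis=-1):
    e = np.exp(a - a.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
N, d, k = 12, 8, 3                      # sequence length, dim, clusters
X = rng.standard_normal((N, d))         # token embeddings
S = rng.standard_normal((k, d))         # learnable surrogate tokens (fixed here)

# Cluster affinity between tokens and surrogate tokens; each token is
# routed to its highest-affinity cluster.
affinity = softmax(X @ S.T)             # (N, k)
assign = affinity.argmax(axis=1)

out = np.zeros_like(X)
summaries = np.zeros((k, d))
for c in range(k):
    idx = np.where(assign == c)[0]
    if idx.size == 0:
        continue
    Xc = X[idx]
    A = softmax(Xc @ Xc.T / np.sqrt(d))  # full attention only within cluster
    out[idx] = A @ Xc
    # Affinity-weighted cluster summary, to be shared with other clusters.
    summaries[c] = (affinity[idx, c:c + 1] * Xc).sum(0) / affinity[idx, c].sum()

# Within-cluster attention costs O(sum of n_c^2), i.e., O(alpha * N) for
# bounded cluster sizes, instead of O(N^2) for full self-attention.
```

In the full mechanism, each token's output would additionally attend to the `summaries` of the other clusters, restoring global information flow at O(kN) extra cost.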
Editorial for Pattern Recognition Letters special issue on Advances in Disinformation Detection and Media Forensics
IF 3.9 | CAS Tier 3, Computer Science
Pattern Recognition Letters | Pub Date: 2024-09-05 | DOI: 10.1016/j.patrec.2024.09.004
Citations: 0

SiamMAF: A multipath and feature-enhanced thermal infrared tracker
IF 3.9 | CAS Tier 3, Computer Science
Pattern Recognition Letters | Pub Date: 2024-09-03 | DOI: 10.1016/j.patrec.2024.09.003
Abstract: Thermal infrared (TIR) images are visually blurred and low in information content. Some TIR trackers focus on enhancing the semantic information of TIR features while neglecting detailed information that is equally important for TIR tracking: after target localization, detailed information helps the tracker generate accurate prediction boxes. In addition, simple element-wise addition does not fully exploit and fuse multiple response maps. To address these issues, this study proposes a multipath and feature-enhanced Siamese tracker (SiamMAF) for TIR tracking. We design a complementarity-based feature-enhanced module (FEM) that highlights the key semantic information of the target while preserving detailed object information, and we introduce a response fusion module (RFM) that adaptively fuses multiple response maps. Extensive experiments on two challenging benchmarks show that SiamMAF outperforms many existing state-of-the-art TIR trackers and runs at a steady 31 FPS.
Citations: 0

Visual speech recognition using compact hypercomplex neural networks
IF 3.9 | CAS Tier 3, Computer Science
Pattern Recognition Letters | Pub Date: 2024-09-03 | DOI: 10.1016/j.patrec.2024.09.002
Abstract: Recent progress in visual speech recognition systems, driven by advances in deep learning and large-scale public datasets, has led to impressive performance compared to human professionals. These systems have numerous potential real-life applications that could greatly benefit many individuals. However, most are not designed with practicality in mind: they require large models and powerful hardware, which limits their applicability in resource-constrained environments and other real-world tasks, and few works focus on developing lightweight systems deployable under such conditions. Considering these issues, we propose compact networks that take advantage of hypercomplex layers, which use a sum of Kronecker products to reduce overall parameter demands and model sizes. We train and evaluate our proposed models on the largest public dataset for single-word speech recognition in English. Our experiments show that high compression rates are achievable with a minimal accuracy drop, indicating the method's potential for practical applications in lower-resource environments. Code and models are available at https://github.com/jpanagos/vsr_phm.
Citations: 0

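The parameter saving behind such hypercomplex layers can be sketched directly: a weight matrix built as a sum of Kronecker products, W = sum_i A_i kron B_i, needs far fewer parameters than a dense matrix of the same shape (approaching 1/n of the dense count for large layers). The dimensions below are illustrative, and a PHM-style parameterization is assumed; the paper's exact layer design may differ:

```python
import numpy as np

def phm_weight(A, B):
    """Parameterized-hypercomplex-style weight: W = sum_i kron(A_i, B_i).
    A: (n, n, n) small 'rule' matrices; B: (n, d_out//n, d_in//n) blocks."""
    return sum(np.kron(A[i], B[i]) for i in range(A.shape[0]))

n, d_in, d_out = 4, 16, 16
rng = np.random.default_rng(0)
A = rng.standard_normal((n, n, n))
B = rng.standard_normal((n, d_out // n, d_in // n))

W = phm_weight(A, B)                    # full (16, 16) weight matrix
dense_params = d_out * d_in             # 256 parameters in a dense layer
phm_params = A.size + B.size            # 64 + 64 = 128 parameters here

x = rng.standard_normal(d_in)
y = W @ x                               # forward pass through the layer
print(W.shape, phm_params, dense_params)
```

At this toy size the saving is 2x; for realistic layer widths the A_i terms become negligible and the count approaches d_out*d_in/n, which is where the reported compression rates come from.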
A method for evaluating deep generative models of images for hallucinations in high-order spatial context
IF 3.9 | CAS Tier 3, Computer Science
Pattern Recognition Letters | Pub Date: 2024-09-02 | DOI: 10.1016/j.patrec.2024.08.023
Abstract: Deep generative models (DGMs) have the potential to revolutionize diagnostic imaging. Generative adversarial networks (GANs) are one widely employed kind of DGM. The overarching problem with deploying any DGM in mission-critical applications is the lack of adequate and/or automatic means of assessing the domain-specific quality of generated images. In this work, we demonstrate several objective, human-interpretable tests of images output by two popular DGMs. These tests serve two goals: (i) ruling out DGMs for downstream, domain-specific applications, and (ii) quantifying hallucinations in the expected spatial context of DGM-generated images. The designed datasets are made public, and the proposed tests could also serve as benchmarks and aid the prototyping of emerging DGMs. Although demonstrated on GANs, the tests can be employed as a benchmark for evaluating any DGM. Specifically, we designed several stochastic context models (SCMs) of distinct image features that can be recovered after generation by a trained DGM. Together, these SCMs encode features as per-image constraints on prevalence, position, intensity, and/or texture. Several of these features are high-order, algorithmic pixel-arrangement rules not readily expressed in covariance matrices. We designed and validated statistical classifiers to detect specific effects of the known arrangement rules, then tested the rates at which two different DGMs correctly reproduced the feature context under a variety of training scenarios and degrees of feature-class similarity. We found that ensembles of generated images can appear largely accurate visually and score highly on ensemble measures while failing to exhibit the known spatial arrangements. The main conclusion is that SCMs can be engineered, and can serve as benchmarks, to quantify numerous per-image errors, i.e., hallucinations, that may not be captured in ensemble statistics but can plausibly affect subsequent use of DGM-generated images.
Citations: 0

Introduction to the special section "Advances trends of pattern recognition for intelligent systems applications" (SS:ISPR23)
IF 3.9 | CAS Tier 3, Computer Science
Pattern Recognition Letters | Pub Date: 2024-09-01 | DOI: 10.1016/j.patrec.2024.08.005
Citations: 0