Latest Articles in Pattern Recognition

The MERIT dataset: Modelling and efficiently rendering interpretable transcripts
IF 7.6, CAS Tier 1, Computer Science
Pattern Recognition, Pub Date: 2025-09-29, DOI: 10.1016/j.patcog.2025.112502
Ignacio de Rodrigo, Alberto Sanchez-Cuadrado, Jaime Boal, Alvaro J. Lopez-Lopez
Abstract: This paper introduces the MERIT Dataset, a multimodal, fully labeled dataset of school grade reports. Comprising over 400 labels and 33k samples, the MERIT Dataset is a resource for training models on demanding Visually-rich Document Understanding tasks. It contains multimodal features that link patterns in the textual, visual, and layout domains. The MERIT Dataset also includes biases in a controlled way, making it a valuable tool to benchmark biases induced in Language Models. The paper outlines the dataset's generation pipeline and highlights its main features and patterns in its different domains. We benchmark the dataset for token classification, showing that it poses a significant challenge even for SOTA models.
Pattern Recognition, Volume 172, Article 112502.
Citations: 0
Boundary-aware shape recognition using dynamic graph convolutional networks
IF 7.6, CAS Tier 1, Computer Science
Pattern Recognition, Pub Date: 2025-09-29, DOI: 10.1016/j.patcog.2025.112511
Jinming Zhao, Junyu Dong, Huiyu Zhou, Xinghui Dong
Abstract: Shape recognition, which often involves topology in mathematics, is a fundamental subfield of image recognition. Although deep learning techniques have been widely applied to image recognition and have achieved great success, this is not the case for 2D shape recognition. Inspired by the powerful spatial representation ability of Graph Convolutional Networks (GCNs), we leverage this technique to address the shape recognition problem. To this end, we propose a Boundary-Aware Shape Recognition Graph Convolutional Network (BASR-GCN). Specifically, we first extract the maximum boundary of the object depicted in an image and sample this boundary into a set of key points. Given a key point, a set of features is then extracted as its representation. Furthermore, we construct a series of graphs from the key points and use the BASR-GCN to learn the spatial layout of these points. In addition, we introduce a multi-scale BASR-GCN (BASR-GCN-MS) to exploit the shape features extracted at different scales. To our knowledge, GCNs have not been applied to 2D shape recognition before. The proposed method is tested on four publicly available shape data sets. Experimental results show that our method outperforms the baselines. We attribute these promising results to the ability of the BASR-GCN to capture the spatial layout and semantic information of the shape through graph convolutions.
Pattern Recognition, Volume 172, Article 112511.
Citations: 0
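The boundary-to-graph pipeline described in the abstract (extract the maximum boundary, sample key points, compute per-point features, build a graph for a GCN) can be made concrete with a short sketch. The feature choice below (normalised coordinates plus a local turning angle) and the k-nearest-neighbour graph construction are illustrative assumptions, not the authors' implementation.

```python
import cv2
import numpy as np

def boundary_keypoint_graph(mask: np.ndarray, num_points: int = 64, k: int = 4):
    """Sample key points from the largest object boundary and build a k-NN graph.

    mask: uint8 binary image (object = 255). Returns (features, adjacency).
    Illustrative sketch only, not the BASR-GCN implementation.
    """
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    boundary = max(contours, key=cv2.contourArea).squeeze(1).astype(np.float32)  # (M, 2)

    # Uniformly sample num_points key points along the boundary.
    idx = np.linspace(0, len(boundary) - 1, num_points).astype(int)
    pts = boundary[idx]

    # Per-point features: normalised coordinates plus a local turning angle (assumed choice).
    pts_norm = (pts - pts.mean(0)) / (pts.std(0) + 1e-6)
    prev, nxt = np.roll(pts, 1, axis=0), np.roll(pts, -1, axis=0)
    v1, v2 = pts - prev, nxt - pts
    angle = np.arctan2(v2[:, 1], v2[:, 0]) - np.arctan2(v1[:, 1], v1[:, 0])
    feats = np.concatenate([pts_norm, angle[:, None]], axis=1)  # (num_points, 3)

    # k-nearest-neighbour adjacency in Euclidean space, symmetrised.
    dists = np.linalg.norm(pts[:, None] - pts[None], axis=-1)
    nn = np.argsort(dists, axis=1)[:, 1:k + 1]
    adj = np.zeros((num_points, num_points), dtype=np.float32)
    rows = np.repeat(np.arange(num_points), k)
    adj[rows, nn.ravel()] = 1.0
    adj = np.maximum(adj, adj.T)
    return feats, adj
```

The resulting (features, adjacency) pair is what a standard graph convolution layer would consume; the multi-scale variant would repeat this with different numbers of sampled key points.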
Unified complex-valued high-resolution frequency representation with nsattention for physiological recognition
IF 7.6, CAS Tier 1, Computer Science
Pattern Recognition, Pub Date: 2025-09-28, DOI: 10.1016/j.patcog.2025.112488
Ye Qiu, Zhenmiao Deng, Xiaohong Huang
Abstract: High-resolution frequency domain analysis is pivotal in a wide range of critical applications, including physiological signal processing, radar target detection, and communication systems. In this study, we present a complex-valued neural network designed for accurate estimation of frequency components encompassing both magnitude and phase: the Unified-Complex High-Resolution Frequency Representation Module (UHFreq). This method generates comprehensive high-resolution frequency domain representations, addressing key limitations of current approaches, which typically capture only amplitude information, omit crucial phase details, and suffer from low resolution in frequency domain outputs. Furthermore, conventional methods for physiological signal detection and recognition require meticulous preprocessing steps, including demodulation and filtering. In response to these challenges, we propose the UHFreq-based Vital Sign Status Detection Network (UVSD-Net), an application example of UHFreq, which classifies different human physiological states starting from raw radar echoes. This model utilizes the UHFreq structure as the frontend for the frequency domain representation of physiological signals from raw radar echoes. The UVSD-Net architecture incorporates a dual-pathway design: one pathway processes frequency domain features via UHFreq, while the other processes time domain amplitude and phase information from the raw radar signals. Furthermore, a weight redistribution mechanism is introduced across the different feature domains to enhance cross-domain feature integration and interaction. This comprehensive end-to-end framework offers a robust approach for analyzing original time domain signals and enables effective execution of downstream tasks.
Pattern Recognition, Volume 172, Article 112488.
Citations: 0
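Since UHFreq is described as a complex-valued network that preserves both magnitude and phase, a minimal sketch of a complex-valued layer may help clarify what "complex-valued" means in practice. The PyTorch module below only illustrates complex arithmetic in a learnable layer; it is not the UHFreq architecture.

```python
import torch
import torch.nn as nn

class ComplexLinear(nn.Module):
    """Complex-valued affine layer, keeping real and imaginary parts explicit.

    Illustration of complex-valued computation only; the actual UHFreq module
    is not reproduced here.
    """
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.re = nn.Linear(in_features, out_features)  # real-part weights
        self.im = nn.Linear(in_features, out_features)  # imaginary-part weights

    def forward(self, x_re: torch.Tensor, x_im: torch.Tensor):
        # (a + ib)(W_re + i W_im) = (a W_re - b W_im) + i (a W_im + b W_re)
        out_re = self.re(x_re) - self.im(x_im)
        out_im = self.im(x_re) + self.re(x_im)
        return out_re, out_im

# A frequency-domain head can then report magnitude and phase per bin:
#   mag = torch.sqrt(out_re**2 + out_im**2); phase = torch.atan2(out_im, out_re)
```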
Scale parallax network for few-shot learning
IF 7.6, CAS Tier 1, Computer Science
Pattern Recognition, Pub Date: 2025-09-28, DOI: 10.1016/j.patcog.2025.112504
Ran Chen, Wen Jiang, Jinbiao Zhu, Jie Geng
Abstract: Varying the input image scale allows convolutional networks to extract different features and learn richer image representations. This serves as a form of data augmentation and helps address few-shot learning challenges. While prior few-shot learning methods have focused on multi-scale feature fusion using techniques such as random resizing or feature pyramids, the exploration of inter-scale feature differences has largely been overlooked. Unlike previous methods, we propose a novel few-shot learning approach, the Scale Parallax Network, which treats images at different resolutions as complementary sources of visual information. We adopt an image-pyramid-based structure to extract multi-scale feature representations and enhance the model's representational capacity. Experimental results demonstrate that our method achieves state-of-the-art performance on the miniImageNet and tieredImageNet datasets.
Pattern Recognition, Volume 172, Article 112504.
Citations: 0
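The image-pyramid idea, running a shared backbone over the same image at several resolutions and treating the per-scale features as complementary views, can be sketched as follows. The scale set, pooling, and the name of the backbone argument are assumptions for illustration, not the paper's configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PyramidFeatureExtractor(nn.Module):
    """Run a shared backbone over an image pyramid and return per-scale embeddings.

    Minimal sketch of multi-scale feature extraction; downstream modelling of
    inter-scale differences is left out.
    """
    def __init__(self, backbone: nn.Module, scales=(1.0, 0.5, 0.25)):
        super().__init__()
        self.backbone = backbone
        self.scales = scales

    def forward(self, x: torch.Tensor):
        feats = []
        for s in self.scales:
            xs = x if s == 1.0 else F.interpolate(
                x, scale_factor=s, mode="bilinear", align_corners=False)
            f = self.backbone(xs)                                  # (B, C, H', W') feature maps
            feats.append(F.adaptive_avg_pool2d(f, 1).flatten(1))   # (B, C) per-scale embedding
        return feats  # inter-scale differences/complements can be modelled downstream
```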
Structuring the processing frameworks for data stream evaluation and application
IF 7.6, CAS Tier 1, Computer Science
Pattern Recognition, Pub Date: 2025-09-28, DOI: 10.1016/j.patcog.2025.112516
Joanna Komorniczak, Paweł Ksieniewicz, Paweł Zyblewski
Abstract: This work addresses the problem of frameworks for data stream processing that can be used to evaluate solutions in an environment that resembles real-world applications. The definition of structured frameworks stems from the need to reliably assess data stream classification methods, considering the constraints of delayed label access, the costs of label acquisition, and the costs of model adaptation. Current experimental evaluation often freely exploits the assumption of immediate label access to monitor recognition quality and adapt methods to changing concepts. The problem is addressed by reviewing currently described methods and techniques for data stream processing and verifying their outcomes in a simulated environment. This work defines a taxonomy of data stream processing frameworks and presents four processing schemes that link the tasks of drift detection and classification while considering the natural phenomenon of label delay. The presented research shows that classification quality is significantly affected not only by the disruptive phenomena of concept drift and label delay, but also by the adopted processing scheme, which describes the flow of labels in the recognition system. Choosing a specific processing framework according to real-world constraints proves to be a critical aspect of reliable and realistic experimental evaluation.
Pattern Recognition, Volume 172, Article 112516.
Citations: 0
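A concrete way to see how label delay changes evaluation is a test-then-train loop in which predictions are scored immediately but the model may only adapt once the delayed label arrives. The sketch below is one possible scheme under assumptions: an incremental learner with scikit-learn-style predict/partial_fit methods, already warm-started on a small labelled burn-in, and a fixed delay; the paper's frameworks also cover drift detection and label-acquisition costs, which are omitted here.

```python
from collections import deque

def delayed_label_prequential(stream, model, label_delay: int = 200):
    """Test-then-train evaluation under delayed label arrival (one possible scheme).

    `stream` yields (x, y) pairs in order; the true label of a sample is revealed
    to the model only `label_delay` steps later.
    """
    pending = deque()            # samples whose labels have not arrived yet
    correct = total = 0
    for x, y in stream:
        y_pred = model.predict([x])[0]          # score before the label is available
        correct += int(y_pred == y)
        total += 1
        pending.append((x, y))
        if len(pending) > label_delay:          # oldest label arrives now: adapt
            x_old, y_old = pending.popleft()
            model.partial_fit([x_old], [y_old])
    return correct / max(total, 1)
```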
Gaussian splitting attack: Gaussian splatting-based multi-view 3D adversarial attack
IF 7.6, CAS Tier 1, Computer Science
Pattern Recognition, Pub Date: 2025-09-28, DOI: 10.1016/j.patcog.2025.112466
Lingzhuang Meng, Mingwen Shao, Yuanjian Qiao, Wenjie Liu, Xiang Lv
Abstract: Existing multi-view adversarial attack methods utilize Neural Radiance Fields (NeRF) to generate adversarial samples from different viewpoints of an object, effectively deceiving deep neural networks. However, these methods simply add noise to the rendered images and fail to construct explicit 3D adversarial samples, limited by the implicit representation of NeRF. To address this limitation, we propose a novel Gaussian Splitting Attack (GSAttack) scheme based on Gaussian Splatting to generate explicit 3D adversarial samples that deceive the classifier from various viewpoints. Specifically, we first quantify the contribution of each Gaussian to the adversarial attack based on its gradient. Subsequently, we split tiny Gaussians from the high-contribution Gaussians as initial 3D perturbations, which are then optimized by an adversarial loss to ensure deception across diverse viewpoints. Furthermore, to ensure the invisibility of the 3D perturbation, we devise position and color losses that keep the perturbations tightly bound to the object surface and minimize color differences. Owing to these designs, our 3D perturbations are more natural in space and more effective at attacking neural networks. Experimental results show that the 3D adversarial samples generated by our GSAttack can effectively deceive the classifier over a wider range of viewpoints and achieve superior visual quality compared to existing schemes.
Pattern Recognition, Volume 172, Article 112466.
Citations: 0
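The "quantify contribution by gradient, then split tiny Gaussians from the high-contribution ones" step can be sketched abstractly on the Gaussian centres alone. The renderer, opacity/scale handling, and the position/colour losses are omitted; the parameter names, split ratio, and offset scale below are assumptions for illustration.

```python
import torch

def split_high_contribution_gaussians(positions: torch.Tensor,
                                      adv_loss: torch.Tensor,
                                      split_ratio: float = 0.05,
                                      offset_scale: float = 1e-2):
    """Select Gaussians with the largest adversarial-gradient contribution and
    split tiny copies from them as initial 3D perturbations.

    positions: (N, 3) Gaussian centres with requires_grad=True, part of the graph
               of `adv_loss` (a scalar adversarial loss over rendered views).
    Illustrative sketch only; not the GSAttack implementation.
    """
    grads, = torch.autograd.grad(adv_loss, positions, retain_graph=True)
    contribution = grads.norm(dim=-1)                    # per-Gaussian gradient magnitude
    k = max(1, int(split_ratio * positions.shape[0]))
    top_idx = contribution.topk(k).indices

    # Tiny offspring Gaussians placed near their parents; these become the
    # learnable 3D perturbation, later optimised under adversarial and
    # position/colour losses.
    noise = offset_scale * torch.randn(k, 3, device=positions.device)
    offspring = positions[top_idx].detach() + noise
    return offspring.requires_grad_(True), top_idx
```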
Wavelet-based physically guided normalization network for real-time traffic dehazing
IF 7.6, CAS Tier 1, Computer Science
Pattern Recognition, Pub Date: 2025-09-27, DOI: 10.1016/j.patcog.2025.112451
Shengdong Zhang, Xiaoqin Zhang, Linlin Shen, Shaohua Wan, Wenqi Ren
Abstract: Single image dehazing is a pressing task in everyday life, and deep learning has facilitated numerous research advancements. However, the field of image dehazing is currently encountering a bottleneck. We identify two primary reasons for the difficulty in further enhancing dehazing quality. First, Convolutional Neural Networks (CNNs) struggle to capture long-range dependencies. Second, haze causes pixels that are similar in haze-free images to diverge in appearance. To address these challenges simultaneously, we propose a Wavelet-Based Physically Guided Normalization Dehazing Network (WBPGNDN). Specifically, we introduce a physically guided normalization designed to restore the similarity of pixels as seen in haze-free images. Additionally, we utilize wavelet decomposition to capture long-range dependencies. While traditional methods typically apply wavelet decomposition in the image domain, we instead implement it in the feature domain. Experiments on both real and simulated hazy images demonstrate the dehazing efficacy of the proposed method. The extensive results indicate that our approach matches or surpasses state-of-the-art methods, yielding high-quality visual outcomes and effectively addressing the limitations of existing methods.
Pattern Recognition, Volume 172, Article 112451.
Citations: 0
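Applying wavelet decomposition in the feature domain, rather than the image domain, amounts to splitting each feature map into a down-sampled low-frequency band and detail bands before further convolution; the down-sampled band is what gives subsequent convolutions a larger effective receptive field. A minimal single-level Haar decomposition over (B, C, H, W) tensors is sketched below; the actual wavelet and its placement inside WBPGNDN are not specified here.

```python
import torch

def haar_dwt2d(feat: torch.Tensor):
    """Single-level 2D Haar wavelet decomposition of feature maps (B, C, H, W).

    Returns the low-frequency sub-band (half resolution) and three detail
    sub-bands. Assumes H and W are even. Sketch only.
    """
    a = feat[..., 0::2, 0::2]  # top-left of each 2x2 block
    b = feat[..., 0::2, 1::2]  # top-right
    c = feat[..., 1::2, 0::2]  # bottom-left
    d = feat[..., 1::2, 1::2]  # bottom-right
    low      = (a + b + c + d) / 2   # low-frequency (approximation) band
    detail_h = (a - b + c - d) / 2   # horizontal detail
    detail_v = (a + b - c - d) / 2   # vertical detail
    detail_d = (a - b - c + d) / 2   # diagonal detail
    return low, detail_h, detail_v, detail_d
```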
Anisotropic pth-order TV-based Retinex decomposition with adaptive reflectance regularizer for low-light image enhancement
IF 7.6, CAS Tier 1, Computer Science
Pattern Recognition, Pub Date: 2025-09-27, DOI: 10.1016/j.patcog.2025.112468
Po-Wen Hsieh, Suh-Yuh Yang
Abstract: Image enhancement plays a fundamental role in image processing and computer vision. Its primary purpose is to improve the visual quality of an image by enhancing its contrast and brightness. However, most existing enhancement methods tend to amplify imaging noise, especially in very dark regions of the image, leading to undesirable artifacts in the enhanced result. To address this problem, this paper aims to develop a method that enhances low-light images without introducing these artifacts. We propose a novel anisotropic pth-order total variation-based (ApTV-based) Retinex decomposition with an adaptive reflectance regularizer for low-light image enhancement, where p represents the exponent in our regularization term, controlling the degree of structure preservation in the resulting image. Specifically, for 0 < p ≤ 1, the ApTV with a smaller p-value can effectively extract strong structures of the image, making it suitable for piecewise-smooth illumination estimation. In contrast, a larger p-value can help preserve the image's fine details and suppress noise, making it favorable for accurate reflectance estimation. More importantly, since the degree of noise amplification varies across different regions, we incorporate the obtained illumination into the reflectance regularizer to enable adaptive denoising. Extensive numerical experiments and comparisons with state-of-the-art low-light image enhancement methods demonstrate that the proposed adaptive Retinex decomposition approach achieves superior performance both qualitatively and quantitatively. It effectively addresses noise amplification and artifact issues while enhancing overall image quality.
Pattern Recognition, Volume 172, Article 112468.
Citations: 0
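As a rough guide to how such a decomposition is usually posed, a generic variational form consistent with the description (Retinex splitting I = R ∘ L, an anisotropic pth-order TV term on the illumination, and an illumination-weighted reflectance regularizer) is written below. This is a plausible template under those assumptions, not the paper's exact functional.

```latex
% Observed low-light image I, illumination L, reflectance R, with I = R \circ L.
% Generic form of an ApTV-regularised Retinex decomposition (illustrative only):
\min_{R,\,L}\;
  \| R \circ L - I \|_2^2
  \;+\; \alpha \sum_{d \in \{x,y\}} \| \nabla_d L \|_p^p            % ApTV: small p keeps L piecewise smooth
  \;+\; \beta  \sum_{d \in \{x,y\}} \| w(L)\, \nabla_d R \|_1,      % reflectance term; weight w(L) adapts denoising to dark regions
\qquad 0 < p \le 1 .
```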
Generative models for noise-robust training in unsupervised domain adaptation
IF 7.6, CAS Tier 1, Computer Science
Pattern Recognition, Pub Date: 2025-09-27, DOI: 10.1016/j.patcog.2025.112450
Zhongying Deng, Da Li, Junjun He, Xiaojiang Peng, Yi-Zhe Song, Tao Xiang
Abstract: Recent unsupervised domain adaptation (UDA) methods show the effectiveness of pseudo-labels for the unlabeled target domain. However, pseudo-labels inevitably contain noise, which can degrade adaptation performance. This paper therefore proposes Generative models for Noise-Robust Training (GeNRT), a method designed to mitigate label noise while reducing domain shift. The key idea is that the class-wise distributions of the target domain, modeled by generative models, provide more reliable pseudo-labels than individual pseudo-labeled instances, because the distributions statistically better represent class-wise information than a single instance. Based on this observation, GeNRT incorporates a Distribution-based Class-wise Feature Augmentation (D-CFA), which enhances feature representations by sampling features from target class distributions modeled by generative models. These augmented features serve a dual purpose: (1) providing class-level knowledge from generative models to train a noise-robust discriminative classifier, and (2) acting as intermediate features to bridge the domain gap at the class level. Furthermore, GeNRT leverages Generative and Discriminative Consistency (GDC), enforcing consistency regularization between a generative classifier (formed by all class-wise generative models) and the learned discriminative classifier. By aggregating knowledge across target class distributions, GeNRT improves pseudo-label reliability and enhances robustness against label noise. Extensive experiments on Office-Home, VisDA-2017, PACS, and Digit-Five show that GeNRT achieves performance comparable to state-of-the-art methods under both single-source and multi-source UDA settings.
Pattern Recognition, Volume 172, Article 112450.
Citations: 0
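The D-CFA idea, sampling augmented features from per-class target distributions rather than relying on single pseudo-labeled instances, can be approximated with a very simple stand-in in which each class distribution is a diagonal Gaussian fitted to pseudo-labeled features. The paper uses learned generative models per class; the sketch below is only meant to make the sampling step concrete.

```python
import torch

def classwise_feature_augmentation(feats: torch.Tensor,
                                   pseudo_labels: torch.Tensor,
                                   num_classes: int,
                                   n_aug: int = 8):
    """Sample augmented features from per-class distributions of target features.

    feats: (N, D) target-domain features; pseudo_labels: (N,) noisy pseudo-labels.
    Each class distribution is approximated here by a diagonal Gaussian
    (a simplified stand-in for the paper's class-wise generative models).
    """
    aug_feats, aug_labels = [], []
    for c in range(num_classes):
        fc = feats[pseudo_labels == c]
        if fc.shape[0] < 2:
            continue                                   # too few samples to estimate a distribution
        mu, std = fc.mean(0), fc.std(0) + 1e-6
        samples = mu + std * torch.randn(n_aug, feats.shape[1], device=feats.device)
        aug_feats.append(samples)
        aug_labels.append(torch.full((n_aug,), c, dtype=torch.long, device=feats.device))
    return torch.cat(aug_feats), torch.cat(aug_labels)
```

The augmented (feature, label) pairs would then be mixed into classifier training, which is where the noise-robustness benefit is claimed to come from.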
A novel vision transformer with selective residual in multihead self-attention for pattern recognition
IF 7.6, CAS Tier 1, Computer Science
Pattern Recognition, Pub Date: 2025-09-26, DOI: 10.1016/j.patcog.2025.112497
Arun Kumar Sharma, Nishchal K. Verma
Abstract: Intelligent fault diagnosis requires robustly capturing the specific features that represent fault patterns from time-series vibration signals. Most existing solutions require complex preprocessing steps to make the signal suitable for training a deep learning model. This article presents a novel vision transformer with a selective residual in the multihead self-attention network, called the Selective Residual Vision Transformer (SeReViT), for improved robustness in capturing fault signatures from vibration signals. The novel attention mechanism incorporates cumulative attention by utilizing the best attention through residual connections in each block of multihead attention. The best attention is defined as the head whose attention values (the scaled dot product of key and query) have the highest L1 norm among the heads. This enables the model to focus on the selected best attention to learn long-range dependencies among sequential input image patches, resulting in better classification performance. The proposed framework is validated for fault diagnosis on the Case Western Reserve University bearing fault diagnosis dataset and the Paderborn University dataset. Since these datasets are already cleaned, noisy vibration data are created by adding white noise to demonstrate the robustness of the proposed framework. The noisy (raw) vibration signals from the rotating machines are first converted to spectrum images using the short-time Fourier transform with a fixed window size, and the generated images are used to train and validate the proposed SeReViT. The results outperform state-of-the-art convolution-based models for fault diagnosis on both the cleaned and noisy datasets.
Pattern Recognition, Volume 172, Article 112497.
Citations: 0
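One simplified reading of the selective-residual mechanism, selecting per sample the attention head whose score map has the largest L1 norm so that it can be reused through a residual connection, is sketched below. The tensor shapes and the per-sample selection are assumptions; the exact way SeReViT accumulates the residual across blocks is not reproduced.

```python
import torch

def select_best_head(q: torch.Tensor, k: torch.Tensor):
    """Pick the attention head whose score map has the largest L1 norm.

    q, k: (B, heads, N, d) query/key tensors. Returns per-sample score maps
    (B, N, N) of the selected head, which could then be fed forward as a
    residual term. Simplified reading of the selective-residual idea.
    """
    d = q.shape[-1]
    scores = q @ k.transpose(-2, -1) / d ** 0.5     # (B, heads, N, N) scaled dot products
    l1 = scores.abs().sum(dim=(-2, -1))             # L1 norm of each head's score map, (B, heads)
    best = l1.argmax(dim=1)                         # index of the "best" head per sample
    return scores[torch.arange(q.shape[0]), best]   # (B, N, N)
```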