Pattern Recognition: Latest Articles

SAM-Net: Semantic-assisted multimodal network for action recognition in RGB-D videos
IF 7.5 · Q1 · Computer Science
Pattern Recognition. Pub Date: 2025-05-15. DOI: 10.1016/j.patcog.2025.111725
Dan Liu, Fanrong Meng, Jinpeng Mi, Mao Ye, Qingdu Li, Jianwei Zhang
Abstract: The advent of affordable depth sensors has driven extensive research on human action recognition (HAR) in RGB-D videos. Existing unimodal approaches, such as skeleton-based or RGB video-based methods, have inherent limitations: the skeleton modality lacks spatial interaction, while the RGB video modality is highly susceptible to environmental noise. Additionally, multimodal action recognition often suffers from insufficient data fusion and a substantial computational burden for temporal modeling. In this paper, we present an innovative Semantic-Assisted Multimodal Network (SAM-Net) for HAR in RGB-D videos. First, we generate a SpatioTemporal Dynamic Region (STDR) image to replace the RGB video modality by leveraging the skeleton modality, thereby significantly reducing the video volume. Subsequently, we exploit semantic information from large-scale VLMs, which effectively facilitates multimodal adaptation learning. Moreover, we implement an intramodal and intermodal multi-level fusion process for HAR. Finally, through extensive testing on three challenging datasets, the proposed SAM-Net shows consistent state-of-the-art performance across various experimental configurations. Our code will be released at https://github.com/2233950316/code. (Pattern Recognition, Volume 168, Article 111725)
Citations: 0
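The abstract does not spell out how the SpatioTemporal Dynamic Region (STDR) image is built, so the sketch below is only a plausible illustration of the idea: use the skeleton joints of sampled frames to crop the dynamic region of each RGB frame and tile the crops into one image that stands in for the whole clip. The padding, grid layout, and uniform frame sampling are assumptions, not the paper's procedure.

```python
# Hypothetical sketch: build a single composite image from skeleton-guided
# crops of an RGB clip, in the spirit of SAM-Net's STDR idea.
import numpy as np
import cv2  # assumption: OpenCV is available for cropping/resizing

def stdr_image(frames, joints_2d, grid=(4, 4), patch=112, pad=20):
    """frames: list of HxWx3 uint8 arrays; joints_2d: (T, J, 2) pixel coords (x, y)."""
    T = len(frames)
    idx = np.linspace(0, T - 1, grid[0] * grid[1]).astype(int)   # uniform frame sampling
    tiles = []
    for t in idx:
        xy = joints_2d[t]
        x0, y0 = np.maximum(xy.min(0).astype(int) - pad, 0)      # padded box around joints
        x1, y1 = xy.max(0).astype(int) + pad
        crop = frames[t][y0:y1, x0:x1]
        tiles.append(cv2.resize(crop, (patch, patch)))
    rows = [np.concatenate(tiles[r * grid[1]:(r + 1) * grid[1]], axis=1)
            for r in range(grid[0])]
    return np.concatenate(rows, axis=0)  # one image summarising the clip
```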
Sparse Bayesian learning for dynamical modelling on product manifolds
IF 7.5 · Q1 · Computer Science
Pattern Recognition. Pub Date: 2025-05-14. DOI: 10.1016/j.patcog.2025.111708
Chao Tan, Huan Zhao, Han Ding
Abstract: In imitation learning, Bayesian approaches are widely applied for encoding robotic skills. However, most existing works focus on tasks represented in Euclidean spaces, which cannot properly characterize non-Euclidean behaviours such as robot orientation. In this paper, we propose an intrinsic Bayesian scheme for learning dynamical models on product manifolds, enabling effective learning of pose-related tasks. First, an intrinsic weighted metric is presented for statistical analysis on product manifolds; its validity is rigorously proven by verifying the metric axioms. Then, to decouple the constrained multi-output system without increasing computational complexity, a manifold dynamical model of the sub-system is proposed using parallel transport across local charts, ensuring geometric consistency. After that, a manifold Gaussian process is developed by incorporating the intrinsic weighted metric, significantly improving regression accuracy. However, the computational complexity of this approach scales with the size of the covariance matrix, particularly for large datasets. To further improve efficiency, manifold sparse Bayesian learning is proposed by introducing sparse priors. Finally, simulations and experimental studies show the effectiveness and accuracy of the proposed Bayesian scheme on product manifolds. (Pattern Recognition, Volume 168, Article 111708)
Citations: 0
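As a toy illustration of a weighted metric on a product manifold (not the paper's intrinsic construction or its sparse Bayesian learner), the sketch below combines a Euclidean position distance with a quaternion geodesic distance on S^3 and plugs the result into an RBF kernel for Gaussian-process regression. The weights, length scale, and kernel form are assumptions.

```python
# Toy weighted product-manifold distance for poses in R^3 x S^3, used in a GP.
import numpy as np

def pose_dist2(a, b, w_p=1.0, w_q=1.0):
    """a, b: length-7 arrays [x, y, z, qw, qx, qy, qz] with unit quaternions."""
    d_pos = np.sum((a[:3] - b[:3]) ** 2)
    dot = np.clip(abs(np.dot(a[3:], b[3:])), -1.0, 1.0)
    d_rot = (2.0 * np.arccos(dot)) ** 2          # squared geodesic distance on S^3
    return w_p * d_pos + w_q * d_rot             # weighted product metric

def gp_predict(X_train, y_train, X_test, ell=0.5, noise=1e-3):
    # RBF kernel built from the product-manifold distance (illustrative choice).
    k = lambda A, B: np.array([[np.exp(-pose_dist2(a, b) / (2 * ell ** 2))
                                for b in B] for a in A])
    K = k(X_train, X_train) + noise * np.eye(len(X_train))
    return k(X_test, X_train) @ np.linalg.solve(K, y_train)
```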
Long and Short-Term Collaborative Decision-Making Transformer for Online Action Detection and Anticipation
IF 7.5 · Q1 · Computer Science
Pattern Recognition. Pub Date: 2025-05-14. DOI: 10.1016/j.patcog.2025.111773
Sensen Wang, Chi Zhang, Le Wang, Yuehu Liu
Abstract: Online Action Detection (OAD) and Online Action Anticipation (OAA) are both based on recognizing recent actions in historical videos without utilizing any future information. Existing methods use remote videos to obtain additional visual clues. However, remote videos may also include irrelevant content that attracts attention and causes recent actions to be misinterpreted. To this end, we propose a dual-path collaborative decision-making framework, which integrates one path that exclusively accesses recent videos with another path that can access both recent and remote videos, enabling it to correct low-confidence results misled by irrelevant remote content. On this basis, we propose a unified model for OAD and OAA, named Collaborative Decision-Making Transformer (CDM-Tr), which includes (1) a long-term history-based LT-Path that utilizes remote videos to assist in recognizing actions in recent videos, (2) a short-term history-based ST-Path that relies only on recent videos to recognize recent actions, and (3) a Multi-Task Classifier that makes collaborative decisions based on the weighted summation of these two paths. CDM-Tr achieves state-of-the-art performance on THUMOS'14 (OAD: 72.6%, OAA: 59.2%) and TVSeries (OAD: 89.8%, OAA: 84.2%), and further demonstrates the effectiveness and flexibility of the collaborative decision-making framework. (Pattern Recognition, Volume 168, Article 111773)
Citations: 0
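A minimal sketch of the dual-path decision idea described above, assuming generic per-frame features: an ST-Path encoder sees only recent frames, an LT-Path encoder sees remote plus recent frames, and the two class-score streams are blended by a learnable weight. Layer sizes, the sigmoid fusion weight, and mean pooling are illustrative choices, not CDM-Tr's exact architecture.

```python
# Sketch of a dual-path (long-term / short-term) online action classifier.
import torch
import torch.nn as nn

class DualPathOAD(nn.Module):
    def __init__(self, feat_dim=1024, d_model=256, n_classes=22):
        super().__init__()
        self.proj = nn.Linear(feat_dim, d_model)
        enc = lambda: nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True), 2)
        self.lt_path, self.st_path = enc(), enc()
        self.lt_head = nn.Linear(d_model, n_classes)
        self.st_head = nn.Linear(d_model, n_classes)
        self.alpha = nn.Parameter(torch.tensor(0.5))   # learnable fusion weight

    def forward(self, recent, remote):
        # recent: (B, T_s, feat_dim); remote: (B, T_l, feat_dim)
        st = self.st_path(self.proj(recent)).mean(1)                       # short-term only
        lt = self.lt_path(self.proj(torch.cat([remote, recent], 1))).mean(1)  # long + short
        w = torch.sigmoid(self.alpha)
        return w * self.lt_head(lt) + (1 - w) * self.st_head(st)           # weighted decision

logits = DualPathOAD()(torch.randn(2, 8, 1024), torch.randn(2, 64, 1024))
```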
Regional Type II multivariate Laplace descriptor based on Lie group
IF 7.5 · Q1 · Computer Science
Pattern Recognition. Pub Date: 2025-05-14. DOI: 10.1016/j.patcog.2025.111776
Dengfeng Liao, Guangzhong Liu, Hengda Wang
Abstract: Feature descriptors play a pivotal role in image classification and target detection. This paper introduces three categories of Type II multivariate Laplace region image descriptors based on Lie group theory. Through Lie group theory, we demonstrate that the Laplace distribution function space is a particular type of Riemannian manifold, specifically a Lie group. Subsequently, we prove the equivalence between two categories of partitions obtained through isomorphic mapping, leading to the left (or right) coset. Following this, the left (or right) polar decomposition leads to the symmetric positive definite matrix Lie group. Finally, based on the homeomorphic mapping, we obtain the feature descriptor on the Lie algebra at the mean μ of the embedded matrix. The Laplace descriptors are constructed by selecting d low-level or mid-level original features at each pixel, so the method can handle low-dimensional or high-dimensional features according to actual requirements. We conduct image classification experiments on two benchmark datasets and ship target detection on a public naval image set to validate the effectiveness of the Laplace region image descriptors. The results demonstrate a degree of expressiveness and universality, offering a novel method for image information extraction. (Pattern Recognition, Volume 167, Article 111776)
Citations: 0
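The Type II multivariate Laplace construction itself cannot be reproduced from the abstract, so the sketch below shows the closely related region-covariance recipe only to illustrate the final step the abstract describes: forming an SPD region statistic from d per-pixel features and mapping it through a matrix logarithm into a flat tangent-space (Lie-algebra) vector. The five pixel features and the regularisation term are assumptions.

```python
# Related (not the paper's) region descriptor: SPD statistic -> tangent-space vector.
import numpy as np
from scipy.linalg import logm

def region_descriptor(patch):
    """patch: HxW grayscale array; returns a flat tangent-space vector."""
    h, w = patch.shape
    ys, xs = np.mgrid[0:h, 0:w]
    gy, gx = np.gradient(patch.astype(float))
    feats = np.stack([xs.ravel(), ys.ravel(), patch.ravel(),
                      np.abs(gx).ravel(), np.abs(gy).ravel()], axis=1)   # d = 5 features
    cov = np.cov(feats, rowvar=False) + 1e-6 * np.eye(feats.shape[1])    # SPD region statistic
    log_cov = logm(cov).real                     # map SPD matrix to the tangent space at I
    iu = np.triu_indices_from(log_cov)
    return log_cov[iu]                           # upper triangle as the descriptor

vec = region_descriptor(np.random.rand(32, 32))
```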
CIEG-Net: Context Information Enhanced Gated Network for multimodal sentiment analysis
IF 7.5 · Q1 · Computer Science
Pattern Recognition. Pub Date: 2025-05-13. DOI: 10.1016/j.patcog.2025.111785
Zhongyuan Chen, Chong Lu, Yihan Wang
Abstract: Multimodal sentiment analysis is a widely studied field aimed at recognizing sentiment through multiple modalities. The primary challenge lies in developing high-quality fusion frameworks that address the heterogeneity among modalities and the loss of features during fusion. However, existing research has focused primarily on cross-modal fusion, with relatively little attention paid to the sentiment semantics conveyed by context information. In this paper, we propose the Context Information Enhanced Gated Network (CIEG-Net), a novel fusion network that enhances multimodal fusion by incorporating context information from the input modalities. Specifically, we first construct a context information enhancement module to obtain the input and corresponding context information for the text and audio modalities. Then, we design a fusion network module that facilitates fusion between the text-audio modality and their respective text-context and audio-context information. Finally, we propose a gated network module that dynamically adjusts the weights of each modality and its context information, further strengthening multimodal fusion and attempting to recover missing features. We evaluate the proposed model on three publicly available multimodal sentiment analysis datasets: CMU-MOSI, CMU-MOSEI, and CH-SIMS. Experimental results show that our model significantly outperforms current state-of-the-art models. (Pattern Recognition, Volume 168, Article 111785)
Citations: 0
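A minimal sketch of one gated context-fusion step in the spirit of CIEG-Net, assuming a modality feature and its context feature are already extracted: a sigmoid gate chooses, per dimension, how much of each to keep. The dimensionality and the gate parameterisation are assumptions, not the paper's module.

```python
# Sketch of a gated blend between a modality feature and its context feature.
import torch
import torch.nn as nn

class ContextGate(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(2 * dim, dim), nn.Sigmoid())

    def forward(self, x, ctx):
        # x: (B, dim) modality feature; ctx: (B, dim) its context feature
        g = self.gate(torch.cat([x, ctx], dim=-1))   # per-dimension gate in [0, 1]
        return g * x + (1 - g) * ctx                 # gated blend; context can fill in lost cues

text, text_ctx = torch.randn(4, 256), torch.randn(4, 256)
fused_text = ContextGate()(text, text_ctx)
```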
SDIP: Self-reinforcement deep image prior framework for image processing
IF 7.5 · Q1 · Computer Science
Pattern Recognition. Pub Date: 2025-05-13. DOI: 10.1016/j.patcog.2025.111786
Ziyu Shu, Zhixin Pan
Abstract: Deep image prior (DIP) proposed in recent research has revealed the inherent trait of convolutional neural networks (CNN) for capturing substantial low-level image statistics priors. This framework efficiently addresses the inverse problems in image processing and has induced extensive applications in various domains. In this paper, we propose the self-reinforcement deep image prior (SDIP) as an improved version of the original DIP. We observed that the changes in the DIP networks' input and output are highly correlated during each iteration. SDIP efficiently utilizes this discovery in a reinforcement learning manner, where the current iteration's output is utilized by a steering algorithm to update the network input for the next iteration, guiding the algorithm towards improved results. Experimental results across multiple applications demonstrate that our proposed SDIP framework offers improvement compared to the original DIP method, especially when the corresponding inverse problem is highly ill-posed. (Pattern Recognition, Volume 168, Article 111786)
Citations: 0
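A minimal sketch of the self-reinforcement loop described above, on a toy denoising problem: after each DIP update, the network input is nudged toward the current output instead of being kept fixed. The tiny CNN, the blending rule z <- (1 - beta) * z + beta * output, and the step count stand in for the paper's steering algorithm.

```python
# Toy DIP loop with a self-reinforced (output-steered) input update.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                    nn.Conv2d(32, 3, 3, padding=1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

noisy = torch.rand(1, 3, 64, 64)          # degraded observation
z = torch.rand(1, 3, 64, 64)              # DIP input (noise image)
beta = 0.1                                # steering strength (assumption)

for step in range(200):
    out = net(z)
    loss = ((out - noisy) ** 2).mean()    # data term of the inverse problem
    opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():                 # self-reinforcement: steer the next input
        z = (1 - beta) * z + beta * out.detach()
```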
YOLOv8-MAH: Multi-attribute recognition model for Vehicles
IF 7.5 · Q1 · Computer Science
Pattern Recognition. Pub Date: 2025-05-13. DOI: 10.1016/j.patcog.2025.111849
Yazhou Zhao, Hongdong Zhao, Jianfeng Shi
Abstract: Vehicle multi-attribute recognition is increasingly used in intelligent traffic management, but intra-class variability and inter-class similarity among vehicles make the task difficult. To address this challenge, this paper proposes an improved model named YOLOv8-MAH (YOLOv8 Multi-Attribute-Head), which aims to enhance the performance of multi-attribute recognition. To exploit the transformer encoder's ability to capture fine detail, the C2f (CSP Bottleneck 2 Convolution) module in the backbone network is replaced by the global channel module of MobileViT; at the same time, a C2f-E module based on the EMA (Efficient Multi-scale Attention) architecture is designed to improve the network's ability to recognize different attributes, and an additional detection layer is added to better extract information from fine-grained image regions and identify more attributes. Furthermore, our self-built dataset is labeled along three dimensions, vehicle brand, color, and direction, and is divided into 144 categories. Experimental results show that YOLOv8-MAH achieves strong performance on the vehicle multi-attribute recognition task. (Pattern Recognition, Volume 167, Article 111849)
Citations: 0
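A minimal sketch of a multi-attribute head: a shared pooled feature feeds separate brand, color, and direction classifiers trained with a summed cross-entropy loss. This illustrates the multi-attribute-head idea generically; it is not the YOLOv8-MAH detection head, and the feature size and class counts are placeholders.

```python
# Generic multi-attribute classification head over a shared feature.
import torch
import torch.nn as nn

class MultiAttributeHead(nn.Module):
    def __init__(self, feat_dim=512, n_brand=120, n_color=12, n_dir=4):
        super().__init__()
        self.brand = nn.Linear(feat_dim, n_brand)
        self.color = nn.Linear(feat_dim, n_color)
        self.direction = nn.Linear(feat_dim, n_dir)

    def forward(self, feat):                      # feat: (B, feat_dim) pooled ROI feature
        return {"brand": self.brand(feat),
                "color": self.color(feat),
                "direction": self.direction(feat)}

outs = MultiAttributeHead()(torch.randn(8, 512))
loss = sum(nn.functional.cross_entropy(v, torch.zeros(8, dtype=torch.long))
           for v in outs.values())                # joint multi-task loss (dummy targets)
```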
ST-CGNet: A spatiotemporal gesture recognition network with triplet attention and dual feature fusion
IF 7.5 · Q1 · Computer Science
Pattern Recognition. Pub Date: 2025-05-12. DOI: 10.1016/j.patcog.2025.111767
Jing Hu, Songtao Liu, Mingzhou Liu, Tingyu Zhou, Jiale Lu, Xingyan Zuo
Abstract: Gesture recognition, as a critical area in human–computer interaction, faces significant challenges in modeling complex spatiotemporal dynamics and adapting to gesture diversity. This paper proposes a novel framework, ST-CGNet, which captures multi-scale spatiotemporal features by integrating an optimized C3D network with a lightweight GatedConvLSTM. The C3D module focuses on short-term spatiotemporal feature extraction, while the GatedConvLSTM captures long-term dependencies through a gating mechanism. To enhance sensitivity to dynamic variations in gestures, a TripletAttention3D module is introduced, which strengthens the model's ability to focus on salient motion patterns. Additionally, an adaptive fusion strategy is employed to dynamically weight and integrate features from both branches, improving performance across diverse gesture types. Experiments on the Jester and EgoGesture datasets demonstrate that the proposed method significantly outperforms baseline models in terms of recognition accuracy and generalization, particularly in handling complex gesture sequences. These results highlight the effectiveness of the proposed approach as a promising solution for dynamic gesture recognition. (Pattern Recognition, Volume 167, Article 111767)
Citations: 0
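A minimal sketch of the adaptive dual-feature fusion step, assuming the short-term (C3D-style) and long-term (GatedConvLSTM-style) branch features are already computed: a small layer predicts one weight per branch and the weighted sum is classified. The branch encoders are omitted and all sizes are placeholders.

```python
# Sketch of adaptive weighting between two branch features before classification.
import torch
import torch.nn as nn

class AdaptiveFusion(nn.Module):
    def __init__(self, dim=256, n_classes=27):
        super().__init__()
        self.weigher = nn.Linear(2 * dim, 2)       # predicts one weight per branch
        self.cls = nn.Linear(dim, n_classes)

    def forward(self, f_short, f_long):
        w = torch.softmax(self.weigher(torch.cat([f_short, f_long], -1)), dim=-1)
        fused = w[:, :1] * f_short + w[:, 1:] * f_long   # input-dependent weighted sum
        return self.cls(fused)

logits = AdaptiveFusion()(torch.randn(4, 256), torch.randn(4, 256))
```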
Voxel Pillar Multi-frame Cross Attention Network for sparse point cloud robust single object tracking
IF 7.5 · Q1 · Computer Science
Pattern Recognition. Pub Date: 2025-05-12. DOI: 10.1016/j.patcog.2025.111771
Luda Zhao, Yihua Hu, Xing Yang, Yicheng Wang, Zhenglei Dou, Yan Zhang
Abstract: Single object tracking (SOT) in dynamic point cloud sequences is critically important in autonomous driving, remote sensing navigation, smart industrial applications, and related fields. Point clouds collected by various LiDAR sensors become sparse due to sensor-related and environmental disturbances, leading to tracking inaccuracies driven by the limited robustness of existing SOT algorithms. To mitigate these challenges, we propose a Voxel Pillar Multi-frame Cross Attention Network (VPMCAN) designed for robust tracking in sparse point clouds. VPMCAN employs voxel-based encoding of pillar information for feature extraction and uses a dense pyramid network to extract multi-scale sparse features. The integration of multi-frame and cross-attention mechanisms during feature fusion allows an effective balance between global and local features, significantly enhancing long-term tracking robustness. Additionally, VPMCAN's design prioritizes a lightweight architecture to ensure hardware-friendly implementation. To showcase its efficacy, we construct a maritime point cloud video sequence dataset and conduct extensive experiments on the KITTI, nuScenes, and Waymo datasets. Results show that VPMCAN attains the best performance in non-sparse scenes and a remarkable 32.5% improvement over state-of-the-art algorithms in sparse scenes, averaging over a 20% performance increase. This highlights the efficacy of the lightweight point cloud SOT algorithm in robustly tracking sparse targets, suggesting promising practical applications. (Pattern Recognition, Volume 167, Article 111771)
Citations: 0
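A minimal sketch of multi-frame cross attention for template/search matching in the spirit of VPMCAN, assuming pillar features have already been extracted: features from several past frames serve as keys and values, and the current search-region features act as queries, with a residual connection. Shapes and layer sizes are assumptions; the pillar encoding and pyramid network are omitted.

```python
# Sketch of cross attention between current search features and past-frame features.
import torch
import torch.nn as nn

class CrossFrameAttention(nn.Module):
    def __init__(self, dim=128, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, search, templates):
        # search: (B, N_s, dim) current-frame pillar features
        # templates: (B, N_t, dim) pillar features concatenated over past frames
        fused, _ = self.attn(query=search, key=templates, value=templates)
        return self.norm(search + fused)           # residual keeps local geometry

out = CrossFrameAttention()(torch.randn(2, 100, 128), torch.randn(2, 300, 128))
```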
Encoding sampling pattern for robust and generalized MRI reconstruction
IF 7.5 · Q1 · Computer Science
Pattern Recognition. Pub Date: 2025-05-11. DOI: 10.1016/j.patcog.2025.111772
Lina Sun, Hong Wang, Qi Xie, Yefeng Zheng, Deyu Meng
Abstract: For the magnetic resonance imaging (MRI) reconstruction task, current deep learning based methods have achieved promising performance. Nevertheless, most of them face two main problems: (1) the down-sampling pattern is generally preset in advance, which makes it hard to flexibly handle complicated real scenarios where the training and testing data are obtained under different sampling settings, thus constraining model generalization; and (2) they have not fully incorporated the physical imaging mechanism linking down-sampling pattern estimation and high-resolution MRI reconstruction into the network design for this specific task. To alleviate these issues, we propose a model-driven MRI reconstruction network called MXNet, which accounts for the relationship between the undersampling pattern and imaging by encoding the mask into the network. Specifically, based on the MR physical imaging process, we first jointly optimize the down-sampling pattern and the MRI reconstruction network. Then, based on the proposed optimization algorithm and the deep unfolding technique, we construct a deep network in which the physical imaging mechanism for MRI reconstruction is fully embedded into the entire learning process. Under different settings between training and testing data, with both consistent and inconsistent down-sampling patterns, extensive experiments substantiate the effectiveness of MXNet in detail reconstruction as well as its generality. Moreover, detailed model analysis validates that the framework retains superior performance when the downsampling mask is accurately available. The code is available at https://github.com/sunliyangna0705/MXNet. (Pattern Recognition, Volume 168, Article 111772)
Citations: 0
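The abstract describes a deep-unfolding design that embeds the imaging model (mask and Fourier transform) into the network. The sketch below shows the generic mechanism such unrolled reconstructions build on: a k-space data-consistency gradient step alternated with a learned refinement CNN, under a single-coil FFT model. It is not MXNet's architecture, and the mask, step size, and denoiser are placeholders.

```python
# Generic unrolled MRI reconstruction: data-consistency step + learned refinement.
import torch
import torch.nn as nn

def data_consistency(x, y, mask, step=1.0):
    """x: (B,1,H,W) current image; y: (B,1,H,W) measured (masked) k-space; mask: (H,W)."""
    k = torch.fft.fft2(x.to(torch.complex64))
    grad = torch.fft.ifft2(mask * (k - y)).real       # gradient of 0.5 * ||M F x - y||^2
    return x - step * grad

denoiser = nn.Sequential(nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(32, 1, 3, padding=1))

x = torch.zeros(1, 1, 64, 64)                         # initial estimate
mask = (torch.rand(64, 64) < 0.3).float()             # example undersampling mask
y = mask * torch.fft.fft2(torch.rand(1, 1, 64, 64).to(torch.complex64))

for _ in range(5):                                    # unrolled iterations
    x = data_consistency(x, y, mask)
    x = x + denoiser(x)                               # learned refinement step
```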