Pattern Recognition: Latest Articles

Prompt-Ladder: Memory-efficient prompt tuning for vision-language models on edge devices
IF 7.5 | Q1 | Computer Science
Pattern Recognition | Pub Date: 2025-02-20 | DOI: 10.1016/j.patcog.2025.111460
Authors: Siqi Cai, Xuan Liu, Jingling Yuan, Qihua Zhou

Abstract: Pre-trained vision-language models (VLMs) have become the foundation for diverse intelligent services in daily life. Common VLMs have large parameter scales and require heavy memory overhead for pre-training, which poses challenges for adapting them to edge devices. To enable memory-efficient VLMs, previous works mainly focus on prompt engineering, which uses trainable soft prompts instead of manually designed hard prompts. However, even though fewer than 3% of the parameters (the prompts) are updated, these methods still require the back-propagation chain to traverse the pre-trained model with its extensive parameters. Consequently, the intermediate activations and gradients occupy a significant amount of memory, greatly hindering adaptation on resource-constrained edge devices. In view of the above, we propose a memory-efficient prompt-tuning method named Prompt-Ladder. Our main idea is to adopt a lightweight ladder network as an agent that bypasses the VLM during back-propagation when optimizing the parameters of the designed multi-modal prompt module. The ladder network fuses the intermediate outputs of the VLM as a guide and is initialized from selected important VLM parameters to maintain model performance. We also share the parameters of the ladder network between text and image data to obtain a more semantically aligned representation across modalities for the optimization of the prompt module. Experiments across seven datasets demonstrate that Prompt-Ladder can significantly reduce memory usage, by at least 27% compared to baselines, while maintaining relatively good performance.

Pattern Recognition, Volume 163, Article 111460 | Citations: 0
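
A minimal PyTorch sketch of the general idea behind such a side "ladder" agent: intermediate activations of the frozen VLM are consumed in detached form, so gradients flow only through the agent (and the prompts), never through the backbone. LadderAgent, its gated fusion, and all sizes are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class LadderAgent(nn.Module):
    """Hypothetical lightweight side network: it reads detached intermediate
    activations of a frozen backbone, so back-propagation never enters the backbone."""
    def __init__(self, dims, hidden=64, num_classes=10):
        super().__init__()
        self.adapters = nn.ModuleList([nn.Linear(d, hidden) for d in dims])
        self.gates = nn.Parameter(torch.zeros(len(dims)))  # learnable fusion weights
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, feats):
        # feats: list of [B, d_i] intermediate VLM outputs, detached so that the
        # backbone's activations and gradients need not be kept for training
        h = 0.0
        for gate, adapter, f in zip(self.gates, self.adapters, feats):
            h = h + torch.sigmoid(gate) * adapter(f.detach())
        return self.head(h)

# Training would then optimize only the prompt parameters plus LadderAgent.parameters(),
# with the VLM forward pass run under torch.no_grad().
```
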
AMLCA: Additive multi-layer convolution-guided cross-attention network for visible and infrared image fusion
IF 7.5 | Q1 | Computer Science
Pattern Recognition | Pub Date: 2025-02-20 | DOI: 10.1016/j.patcog.2025.111468
Authors: Dongliang Wang, Chuang Huang, Hao Pan, Yuan Sun, Jian Dai, Yanan Li, Zhenwen Ren

Abstract: Multimodal image fusion is widely used in the processing of multispectral signals, e.g., visible and infrared images; it aims to create an information-rich fused image by combining the complementary information from different wavebands. Current fusion methods face significant challenges in extracting complementary information from sensors while simultaneously preserving local details and global dependencies. To address this challenge, we propose an additive multi-layer convolution-guided cross-attention network (AMLCA) for visible and infrared image fusion, which consists of two sub-modules: an additive cross-attention module (ACAM) and a wavelet convolution-guided transformer module (WCGTM). Specifically, the former enhances feature interaction and captures global holistic information through an additive cross-attention mechanism, while the latter relies on wavelet convolution to guide the transformer, enhancing the preservation of details from both sources and improving the extraction of local detail information. Moreover, we propose a multi-layer fusion strategy that leverages hidden complementary features from various layers. As a result, AMLCA can effectively extract complementary information from local details and global dependencies, significantly enhancing overall performance. Extensive experiments and ablation analysis on public datasets demonstrate the superiority and effectiveness of AMLCA. The source code is available at https://github.com/Wangdl2000/AMLCA-code.

Pattern Recognition, Volume 163, Article 111468 | Citations: 0
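
To make the cross-attention half of this design concrete, the following PyTorch sketch lets each modality's tokens attend to the other's and adds the attended output back to the input (one possible reading of "additive") before fusion. The module name, dimensions, and head count are assumptions; this is not the paper's ACAM.

```python
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Sketch of bidirectional cross-attention between visible and infrared tokens;
    attended outputs are added back to the inputs, then concatenated for fusion."""
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.vis_to_ir = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.ir_to_vis = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, vis, ir):                 # each [B, N, dim]
        v2i, _ = self.vis_to_ir(vis, ir, ir)    # visible queries attend to infrared
        i2v, _ = self.ir_to_vis(ir, vis, vis)   # infrared queries attend to visible
        return torch.cat([vis + v2i, ir + i2v], dim=-1)   # [B, N, 2*dim]
```
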
Multi-Level Knowledge Distillation with Positional Encoding Enhancement
IF 7.5 | Q1 | Computer Science
Pattern Recognition | Pub Date: 2025-02-18 | DOI: 10.1016/j.patcog.2025.111458
Authors: Lixiang Xu, Zhiwen Wang, Lu Bai, Shengwei Ji, Bing Ai, Xiaofeng Wang, Philip S. Yu

Abstract: In recent years, Graph Neural Networks (GNNs) have achieved substantial success in addressing graph-related tasks. Knowledge Distillation (KD), a classical technique for model compression and acceleration, has increasingly been adopted in graph learning to transfer the predictive power of trained GNN models to lightweight, easily deployable Multi-Layer Perceptron (MLP) models. However, this approach often neglects node positional features and relies solely on GNN-generated labels to train MLPs on node content features. Moreover, it depends heavily on local information aggregation, making it difficult to capture global graph structure and thereby limiting performance in node classification tasks. To address this issue, we propose Multi-Level Knowledge Distillation with Positional Encoding Enhancement (MLKD-PE). Our method employs a positional encoding technique to generate node positional features, which are combined with node content features to enhance the MLP's ability to perceive node positions. Additionally, we introduce a multi-level KD technique that aligns the final output of the student model with the teacher model's output and incorporates the teacher's intermediate layer outputs, facilitating detailed knowledge transfer. Experimental results demonstrate that our method significantly improves classification accuracy across multiple datasets compared to the baseline model, confirming its superiority in node classification tasks.

Pattern Recognition, Volume 163, Article 111458 | Citations: 0
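
A minimal sketch of what a multi-level distillation objective of this kind can look like: a temperature-scaled KL term on the final outputs plus an alignment term on intermediate representations, with the positional encoding concatenated to the MLP input. The weighting, temperature, and the assumption that student and teacher features share dimensions are illustrative, not taken from the paper.

```python
import torch
import torch.nn.functional as F

def multilevel_kd_loss(student_logits, teacher_logits,
                       student_feats, teacher_feats, T=2.0, alpha=0.5):
    """Sketch of a multi-level distillation objective: temperature-scaled KL on the
    final outputs plus MSE alignment on intermediate representations."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * T * T
    # If student and teacher dimensions differ, a small projection layer would be needed.
    inter = sum(F.mse_loss(s, t.detach()) for s, t in zip(student_feats, teacher_feats))
    return alpha * soft + (1 - alpha) * inter

# The MLP input could then be a concatenation of content features and a positional
# encoding, e.g. x = torch.cat([content_feats, pos_enc], dim=-1).
```
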
Weakly supervised 3D human pose estimation based on PnP projection model
IF 7.5 | Q1 | Computer Science
Pattern Recognition | Pub Date: 2025-02-18 | DOI: 10.1016/j.patcog.2025.111464
Authors: Xiaoyan Zhang, Yunlai Chen, Huaijing Lai, Hongzheng Zhang

Abstract: This paper describes a weakly supervised end-to-end model for estimating 3D human pose from a single image. The model is trained by reprojecting predicted 3D poses to 2D and matching them against ground-truth 2D poses for supervision, with minimal need for 3D labels. A mathematical camera model with intrinsic and extrinsic parameters enables accurate reprojection, and we use the EPnP algorithm to estimate a precise reprojection. An uncertainty-aware PnP algorithm further improves the accuracy of the estimated reprojection by taking the uncertainty of joint estimation into account. Furthermore, a generative adversarial network with a Transformer-based encoder as the generator is used to predict the 3D pose; it exploits the self-attention mechanism to establish dependencies between joints and fuses features from an edge detection module and a 2D pose estimation module for constraint and spatial information. The model's efficient reprojection method yields competitive results among weakly supervised methods on Human3.6M and MPI-INF-3DHP, with improvements of about 2.5% and 2.45%, respectively.

Pattern Recognition, Volume 163, Article 111464 | Citations: 0
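
The supervision signal can be pictured as a reprojection error: fit a camera pose from the predicted 3D joints and the ground-truth 2D joints with EPnP, project the 3D joints back, and measure the 2D discrepancy. The OpenCV-based sketch below shows this pipeline in a plain (non-differentiable, non-uncertainty-aware) form; the function name and error definition are assumptions, not the paper's loss.

```python
import cv2
import numpy as np

def reprojection_error(joints_3d, joints_2d, K, dist=None):
    """Fit a camera pose with EPnP, reproject the predicted 3D joints, and return
    the mean 2D discrepancy. joints_3d: (N, 3), joints_2d: (N, 2), K: (3, 3)."""
    dist = np.zeros(5) if dist is None else dist
    ok, rvec, tvec = cv2.solvePnP(joints_3d.astype(np.float64),
                                  joints_2d.astype(np.float64),
                                  K, dist, flags=cv2.SOLVEPNP_EPNP)
    proj, _ = cv2.projectPoints(joints_3d.astype(np.float64), rvec, tvec, K, dist)
    return float(np.linalg.norm(proj.reshape(-1, 2) - joints_2d, axis=1).mean())
```
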
CDN4: A cross-view Deep Nearest Neighbor Neural Network for fine-grained few-shot classification
IF 7.5 | Q1 | Computer Science
Pattern Recognition | Pub Date: 2025-02-18 | DOI: 10.1016/j.patcog.2025.111466
Authors: Xiaoxu Li, Shuo Ding, Jiyang Xie, Xiaochen Yang, Zhanyu Ma, Jing-Hao Xue

Abstract: Fine-grained few-shot classification is a challenging task in computer vision that aims to classify images with subtle, detailed differences given scarce labeled samples. A promising avenue for tackling this challenge is to use spatially local features to densely measure the similarity between query and support samples. Compared with image-level global features, local features contain more low-level information that is rich and transferable across categories. However, methods based on spatially local features have difficulty distinguishing subtle category differences due to the lack of sample diversity. To address this issue, we propose a novel method called the Cross-view Deep Nearest Neighbor Neural Network (CDN4). CDN4 applies a random geometric transformation to augment a different view of the support and query samples and subsequently exploits four similarities between the original and transformed views of the query local features and those views of the support local features. The geometric augmentation increases the diversity between samples of the same class, and the cross-view measurement encourages the model to focus on discriminative local features for classification through cross-measurements between the two branches. Extensive experiments validate the superiority of CDN4, which achieves new state-of-the-art results in few-shot classification across various fine-grained benchmarks. Code is available.

Pattern Recognition, Volume 163, Article 111466 | Citations: 0
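
To make the dense local-feature measurement concrete, here is a DN4-style image-to-class similarity in PyTorch: each query local descriptor accumulates its top-k cosine similarities to a class's pooled support descriptors. The code is a simplified single-view sketch with assumed shapes; CDN4 would additionally compute such measurements across the original and transformed views.

```python
import torch
import torch.nn.functional as F

def image_to_class_similarity(query_local, support_local, k=3):
    """DN4-style measure: for every query local descriptor, accumulate its top-k
    cosine similarities to the pooled local descriptors of one support class.
    query_local: [Nq, D], support_local: [Ns, D]."""
    q = F.normalize(query_local, dim=-1)
    s = F.normalize(support_local, dim=-1)
    sims = q @ s.t()                          # [Nq, Ns] cosine similarities
    return sims.topk(k, dim=-1).values.sum()  # image-to-class score

# CDN4 would compute this between the original and geometrically transformed views
# of both query and support features, yielding four scores to combine.
```
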
Preserving logical and functional dependencies in synthetic tabular data
IF 7.5 | Q1 | Computer Science
Pattern Recognition | Pub Date: 2025-02-18 | DOI: 10.1016/j.patcog.2025.111459
Authors: Chaithra Umesh, Kristian Schultz, Manjunath Mahendra, Saptarshi Bej, Olaf Wolkenhauer

Abstract: Dependencies among attributes are a common aspect of tabular data. However, whether existing tabular data generation algorithms preserve these dependencies while generating synthetic data has yet to be explored. In addition to the existing notion of functional dependencies, this article introduces the notion of logical dependencies among attributes, together with a measure to quantify logical dependencies among attributes in tabular data. Utilizing this measure, we compare several state-of-the-art synthetic data generation algorithms and test their capability to preserve logical and functional dependencies on several publicly available datasets. We demonstrate that currently available synthetic tabular data generation algorithms do not fully preserve functional dependencies when they generate synthetic datasets. In addition, we show that some tabular synthetic data generation models can preserve inter-attribute logical dependencies. Our review and comparison of the state of the art reveal research needs and opportunities to develop task-specific synthetic tabular data generation models.

Pattern Recognition, Volume 163, Article 111459 | Citations: 0
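
As a concrete example of the kind of property being tested, the pandas snippet below checks whether a functional dependency lhs → rhs holds in a table (every lhs value maps to exactly one rhs value); running it on real and synthetic data shows whether generation broke the dependency. This is a generic check, not the paper's logical-dependency measure, and the column names are made up for illustration.

```python
import pandas as pd

def holds_functional_dependency(df, lhs, rhs):
    """True if lhs -> rhs holds: every lhs value combination maps to one rhs value."""
    return bool((df.groupby(list(lhs))[rhs].nunique(dropna=False) <= 1).all())

real = pd.DataFrame({"zip": [10, 10, 20], "city": ["A", "A", "B"], "age": [3, 4, 5]})
synthetic = pd.DataFrame({"zip": [10, 10, 20], "city": ["A", "B", "B"], "age": [3, 4, 5]})
print(holds_functional_dependency(real, ["zip"], "city"))       # True
print(holds_functional_dependency(synthetic, ["zip"], "city"))  # False: dependency broken
```
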
A complementary dual model for weakly supervised salient object detection
IF 7.5 | Q1 | Computer Science
Pattern Recognition | Pub Date: 2025-02-18 | DOI: 10.1016/j.patcog.2025.111465
Authors: Liyuan Chen, Dawei Zhang, Xiao Wang, Chang Wan, Shan Jin, Zhonglong Zheng

Abstract: Leveraging scribble annotations for weakly supervised salient object detection significantly reduces reliance on extensive, precise labels during model training. To make optimal use of these sparse annotations, we introduce a novel framework called the Complementary Reliable Region Aggregation Network (CRANet). It integrates complementary information from two models with the same architecture but different parameters: a foreground model that generates the saliency map and a background model that identifies regions excluding salient objects. By merging the outputs of both models, we propose a reliable pseudo-label aggregation strategy that expands the supervision capability of scribble annotations and eliminates the need for predefined thresholds and other parameterized modules. High-confidence predictions are then combined to create pseudo labels that guide the training of both models. Additionally, we incorporate a flipping-consistency method and a flipped guided loss function to enhance prediction consistency and enlarge the training set, effectively addressing the challenges posed by sparse and structurally constrained scribble annotations. Experimental results demonstrate that our approach significantly outperforms existing methods.

Pattern Recognition, Volume 163, Article 111465 | Citations: 0
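
A minimal sketch of merging the two models' outputs into pseudo labels: pixels where the foreground and background models agree receive confident labels, and the rest are ignored. Note that the explicit threshold tau here is a simplification for illustration only; the paper's aggregation strategy specifically avoids predefined thresholds.

```python
import torch

def aggregate_pseudo_labels(fg_prob, bg_prob, tau=0.7):
    """Merge foreground (saliency) and background predictions into pseudo labels:
    agreement yields confident labels, disagreement is ignored (value 255)."""
    pseudo = torch.full_like(fg_prob, 255.0)              # ignore by default
    pseudo[(fg_prob > tau) & (bg_prob < 1 - tau)] = 1.0   # confidently salient
    pseudo[(bg_prob > tau) & (fg_prob < 1 - tau)] = 0.0   # confidently background
    return pseudo
```
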
MSS-PAE: Saving Autoencoder-based Outlier Detection from Unexpected Reconstruction
IF 7.5 | Q1 | Computer Science
Pattern Recognition | Pub Date: 2025-02-17 | DOI: 10.1016/j.patcog.2025.111467
Authors: Xu Tan, Jiawei Yang, Junqi Chen, Sylwan Rahardja, Susanto Rahardja

Abstract: The Autoencoder (AE) is popular in Outlier Detection (OD) due to its strong modeling ability. However, AE-based OD methods face the unexpected reconstruction problem: outliers are reconstructed with low errors, impeding their distinction from inliers. This stems from two aspects. First, with the mean squared error, an AE may overconfidently produce good reconstructions in regions where outliers or potential outliers exist. To address this, the aleatoric uncertainty was introduced to construct the Probabilistic Autoencoder (PAE), and the Weighted Negative Log-Likelihood (WNLL) was proposed to enlarge the score disparity between inliers and outliers. Second, the AE focuses on global modeling and lacks the perception of local information. Therefore, the Mean-Shift Scoring (MSS) method was proposed to exploit the local relationships of the data and reduce the false inliers caused by the AE. Experiments on 32 real-world OD datasets prove the effectiveness of the proposed methods. The combination of WNLL and MSS achieved a 45% relative performance improvement compared to the best baseline, and MSS improved the detection performance of multiple AE-based outlier detectors by an average of 20%. The proposed methods have the potential to advance the development of AEs in OD.

Pattern Recognition, Volume 163, Article 111467 | Citations: 0
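
The aleatoric-uncertainty part can be sketched with PyTorch's Gaussian negative log-likelihood: the PAE predicts a mean and a variance per feature, and the per-sample NLL serves as the outlier score. The weights argument below merely stands in for the paper's WNLL weighting, whose exact form is not reproduced here, and the mean-shift refinement is only hinted at in a comment.

```python
import torch
import torch.nn as nn

gaussian_nll = nn.GaussianNLLLoss(reduction="none")

def outlier_scores(x, mu, var, weights=None):
    """Per-sample negative log-likelihood under the PAE's predicted mean and variance;
    'weights' is a placeholder for the WNLL weighting scheme."""
    nll = gaussian_nll(mu, x, var).sum(dim=-1)   # x, mu, var: [B, D] -> scores [B]
    return nll if weights is None else weights * nll

# Mean-Shift Scoring would further refine each sample's score using the scores of
# its nearest neighbors, bringing local information into the final decision.
```
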
SSDFusion: A scene-semantic decomposition approach for visible and infrared image fusion
IF 7.5 | Q1 | Computer Science
Pattern Recognition | Pub Date: 2025-02-17 | DOI: 10.1016/j.patcog.2025.111457
Authors: Rui Ming, Yixian Xiao, Xinyu Liu, Guolong Zheng, Guobao Xiao

Abstract: Visible and infrared image fusion aims to generate fused images with comprehensive scene understanding and detailed contextual information. However, existing methods often struggle to adequately handle the relationships between different modalities and to optimize for downstream applications. To address these challenges, we propose a novel scene-semantic decomposition-based approach for visible and infrared image fusion, termed SSDFusion. Our method employs a multi-level encoder-fusion network whose fusion modules implement the proposed scene-semantic decomposition and fusion strategy: they extract and fuse scene-related and semantic-related components, respectively, and inject the fused semantics into the scene features, enriching the contextual information in the fused features while sustaining the fidelity of the fused images. Moreover, we incorporate meta-feature embedding to connect the encoder-fusion network with the downstream application network during training, enhancing our method's ability to extract semantics, optimize the fusion effect, and serve tasks such as semantic segmentation. Extensive experiments demonstrate that SSDFusion achieves state-of-the-art image fusion performance while improving results on semantic segmentation tasks. Our approach bridges the gap between feature-decomposition-based image fusion and high-level vision applications, providing a more effective paradigm for multi-modal image fusion. The code is available at https://github.com/YiXian-Xiao/SSDFusion.

Pattern Recognition, Volume 163, Article 111457 | Citations: 0
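
A rough sketch, under assumed layer choices, of what one scene-semantic fusion module could look like: each modality's features are projected into scene and semantic components, fused per component, and the fused semantics are injected back into the fused scene features. This illustrates the stated strategy only; it is not SSDFusion's actual module.

```python
import torch
import torch.nn as nn

class SceneSemanticFusion(nn.Module):
    """Illustrative fusion module: decompose each modality into scene and semantic
    components, fuse per component, then inject the fused semantics into the scene."""
    def __init__(self, dim=64):
        super().__init__()
        self.scene_proj = nn.Conv2d(dim, dim, 3, padding=1)
        self.sem_proj = nn.Conv2d(dim, dim, 1)
        self.inject = nn.Conv2d(2 * dim, dim, 1)

    def forward(self, vis, ir):                               # [B, dim, H, W] each
        scene = self.scene_proj(vis) + self.scene_proj(ir)    # fused scene component
        sem = self.sem_proj(vis) + self.sem_proj(ir)          # fused semantic component
        return self.inject(torch.cat([scene, sem], dim=1))    # semantics injected into scene
```
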
Edge-guided 3D reconstruction from multi-view sketches and RGB images
IF 7.5 | Q1 | Computer Science
Pattern Recognition | Pub Date: 2025-02-17 | DOI: 10.1016/j.patcog.2025.111462
Authors: Wuzhen Shi, Aixue Yin, Yingxiang Li, Yang Wen

Abstract: Considering that edge maps reflect the structure of objects well, we propose to train an end-to-end network that reconstructs 3D models from edge maps. Since edge maps can be easily extracted from RGB images and sketches, our edge-based 3D reconstruction network (EBNet) can be used to reconstruct 3D models from both RGB images and sketches. To exploit both the texture and the edge information of an image for better reconstruction, we further propose an edge-guided 3D reconstruction network (EGNet), which uses edge information to enhance the perception of structure and improve the quality of the reconstructed 3D model. Although sketches contain less texture information than RGB images, experiments show that EGNet also helps improve the reconstruction of 3D models from sketches. To exploit the complementary information among different viewpoints, we further propose a multi-view edge-guided 3D reconstruction network (MEGNet) with a structure-aware fusion module. To the best of our knowledge, we are the first to use edge maps to enhance structural information for multi-view 3D reconstruction. Experimental results on the ShapeNet and Synthetic-LineDrawing benchmarks show that the proposed method outperforms state-of-the-art methods for reconstructing 3D models from both RGB images and sketches. Ablation studies demonstrate the effectiveness of the proposed modules.

Pattern Recognition, Volume 163, Article 111462 | Citations: 0
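
Since the pipeline consumes edge maps rather than raw photos or sketches, a simple preprocessing step suffices to produce its input. The sketch below uses OpenCV's Canny detector, which is an assumption; the paper does not specify a particular edge extractor.

```python
import cv2
import numpy as np

def to_edge_map(image_path, low=100, high=200):
    """Turn an RGB photo or a sketch into a normalized edge map for the network."""
    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    edges = cv2.Canny(gray, low, high)
    return (edges.astype(np.float32) / 255.0)[None]  # [1, H, W]
```
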