Computer Vision and Image Understanding: Latest Articles

Brain tumor image segmentation based on shuffle transformer-dynamic convolution and inception dilated convolution
IF 4.3, CAS Tier 3, Computer Science
Computer Vision and Image Understanding, Pub Date: 2025-03-01, DOI: 10.1016/j.cviu.2025.104324
Lifang Zhou, Ya Wang
Abstract: Accurate segmentation of brain tumors is essential for reliable clinical diagnosis and effective treatment. Convolutional neural networks (CNNs) have improved brain tumor segmentation through their strong local feature modeling. However, they still struggle with unpredictable variations in tumor size and location, which cannot be matched effectively by the local, regular receptive fields of CNN-based methods. To overcome these obstacles, we propose a brain tumor image segmentation method based on shuffle transformer-dynamic convolution and inception dilated convolution, which captures and adapts to diverse tumor characteristics through multi-scale feature extraction. Our model employs Shuffle Transformer-Dynamic Convolution (STDC) to capture both fine-grained and contextual image details, improving localization accuracy. In addition, the Inception Dilated Convolution (IDConv) module addresses large variations in tumor size by capturing information about objects at different scales. The multi-scale feature aggregation (MSFA) module integrates features from different encoder levels, enriching the scale diversity of input patches and enhancing the robustness of segmentation. Experimental results on the BraTS 2019, BraTS 2020, BraTS 2021, and MSD BTS datasets indicate that our model outperforms other state-of-the-art methods in terms of accuracy.
Volume 254, Article 104324. Citations: 0
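The abstract describes the IDConv module only at a high level. As a rough illustration of the general idea, the sketch below builds an inception-style block from parallel dilated 3x3 convolutions whose outputs are concatenated and fused; the branch count, dilation rates, and 1x1 fusion are illustrative assumptions rather than the paper's exact design.

```python
import torch
import torch.nn as nn

class InceptionDilatedConv(nn.Module):
    """Illustrative inception-style block: parallel dilated 3x3 convolutions with
    different dilation rates, concatenated and fused by a 1x1 convolution.
    The rates and fusion scheme are assumptions, not the paper's exact IDConv."""
    def __init__(self, in_ch, out_ch, dilations=(1, 2, 4)):
        super().__init__()
        branch_ch = out_ch // len(dilations)
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_ch, branch_ch, kernel_size=3,
                          padding=d, dilation=d, bias=False),
                nn.BatchNorm2d(branch_ch),
                nn.ReLU(inplace=True),
            )
            for d in dilations
        ])
        self.fuse = nn.Conv2d(branch_ch * len(dilations), out_ch, kernel_size=1)

    def forward(self, x):
        # Each branch sees a different effective receptive field.
        feats = [b(x) for b in self.branches]
        return self.fuse(torch.cat(feats, dim=1))

# Example: a 64-channel feature map.
x = torch.randn(2, 64, 32, 32)
y = InceptionDilatedConv(64, 96)(x)
print(y.shape)  # torch.Size([2, 96, 32, 32])
```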
Efficient feature selection for pre-trained vision transformers
IF 4.3, CAS Tier 3, Computer Science
Computer Vision and Image Understanding, Pub Date: 2025-03-01, DOI: 10.1016/j.cviu.2025.104326
Lan Huang, Jia Zeng, Mengqiang Yu, Weiping Ding, Xingyu Bai, Kangping Wang
Abstract: Handcrafted layer-wise vision transformers have demonstrated remarkable performance in image classification. However, their high computational cost limits their practical applications. In this paper, we first identify and highlight the data-independent feature redundancy in pre-trained Vision Transformer (ViT) models. Based on this observation, we explore the feasibility of searching for the best substructure within the original pre-trained model. To this end, we propose EffiSelecViT, a novel pruning method aimed at reducing the computational cost of ViTs while preserving their accuracy. EffiSelecViT introduces importance scores for both self-attention heads and Multi-Layer Perceptron (MLP) neurons in pre-trained ViT models, and applies L1 regularization to constrain and learn these scores. In this simple way, components that are crucial for model performance are assigned higher scores, while those with lower scores are identified as less important and subsequently pruned. Experimental results demonstrate that EffiSelecViT can prune DeiT-B to retain only 64% of its FLOPs while maintaining accuracy. This efficiency-accuracy trade-off is consistent across various ViT architectures. Furthermore, qualitative analysis reveals enhanced information expression in the pruned models, affirming the effectiveness and practicality of EffiSelecViT. The code is available at https://github.com/ZJ6789/EffiSelecViT.
Volume 254, Article 104326. Citations: 0
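The core pruning idea, learnable importance scores constrained by L1 regularization, can be sketched in a few lines. The block below attaches a learnable score to each self-attention head and exposes an L1 penalty to add to the training loss; the gate placement and initialization are assumptions, not EffiSelecViT's exact implementation (which also scores MLP neurons).

```python
import torch
import torch.nn as nn

class GatedHeadAttention(nn.Module):
    """Self-attention with a learnable importance score per head. Head outputs
    are scaled by their scores before the output projection; an L1 penalty on
    the scores pushes unimportant heads toward zero so they can be pruned."""
    def __init__(self, dim, num_heads):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)
        # One learnable importance score per attention head.
        self.head_scores = nn.Parameter(torch.ones(num_heads))

    def forward(self, x):
        B, N, D = x.shape
        qkv = self.qkv(x).view(B, N, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)            # each (B, H, N, d)
        attn = (q @ k.transpose(-2, -1)) / self.head_dim ** 0.5
        out = attn.softmax(dim=-1) @ v                  # (B, H, N, d)
        out = out * self.head_scores.view(1, -1, 1, 1)  # gate each head
        return self.proj(out.transpose(1, 2).reshape(B, N, D))

    def l1_penalty(self):
        return self.head_scores.abs().sum()

block = GatedHeadAttention(dim=192, num_heads=3)
x = torch.randn(2, 197, 192)
loss = block(x).pow(2).mean() + 1e-4 * block.l1_penalty()  # task loss + L1 sparsity term
loss.backward()
```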
Lifelong visible–infrared person re-identification via replay samples domain-modality-mix reconstruction and cross-domain cognitive network
IF 4.3, CAS Tier 3, Computer Science
Computer Vision and Image Understanding, Pub Date: 2025-03-01, DOI: 10.1016/j.cviu.2025.104328
Xianyu Zhu, Guoqiang Xiao, Michael S. Lew, Song Wu
Abstract: Adapting statically trained models to a continual influx of data streams is a pivotal research challenge. Meanwhile, visible-infrared person re-identification (VI-ReID) enables all-day surveillance, advancing intelligent monitoring and public safety. We therefore conduct a finer-grained, camera-level exploration of the lifelong VI-ReID task, aiming to equip learned models with the ability to learn and remember continuously from ongoing data streams. This task confronts the dual challenges of cross-modality and cross-domain variation. In this paper, we propose a Domain-Modality-Mix (DMM) based replay sample reconstruction strategy and a Cross-domain Cognitive Network (CDCN) to address these challenges. First, we establish an effective and expandable baseline model based on residual neural networks. Second, capitalizing on the unexploited knowledge in a memory bank of diverse replay samples, we strengthen the model's resistance to forgetting with the Domain-Modality-Mix strategy, which performs cross-domain, cross-modal image-level replay sample reconstruction and effectively alleviates catastrophic forgetting induced by modality and domain variations. Finally, guided by the Chunking Theory in cognitive psychology, we design a Cross-domain Cognitive Network that incorporates a camera-aware, expandable graph convolutional cognitive network to facilitate adaptive learning of intra-modal consistencies and cross-modal similarities within continuous cross-domain data streams. Extensive experiments demonstrate that our method has remarkable adaptability and robust resistance to forgetting, and outperforms multiple state-of-the-art methods in lifelong VI-ReID. The source code is available at https://github.com/SWU-CS-MediaLab/DMM-CDCN.
Volume 254, Article 104328. Citations: 0
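The Domain-Modality-Mix reconstruction is described only as image-level mixing in the abstract. The toy sketch below blends two replay images drawn from different (domain, modality) pairs in a memory bank with a simple convex combination; the blend operation and the fixed mixing ratio are assumptions, not the paper's actual reconstruction.

```python
import random
import torch

def domain_modality_mix(bank, lam=0.5):
    """Illustrative image-level mix of two replay samples drawn from different
    (domain, modality) entries of a memory bank. The convex blend and the fixed
    ratio are assumptions; the paper's DMM reconstruction may differ.

    bank: dict mapping (domain_id, modality) -> tensor of images (N, C, H, W)
    """
    key_a, key_b = random.sample(list(bank.keys()), 2)
    img_a = bank[key_a][random.randrange(bank[key_a].shape[0])]
    img_b = bank[key_b][random.randrange(bank[key_b].shape[0])]
    return lam * img_a + (1.0 - lam) * img_b

# Toy memory bank: two domains, visible and infrared modalities.
bank = {
    ("domain0", "visible"): torch.rand(8, 3, 256, 128),
    ("domain0", "infrared"): torch.rand(8, 3, 256, 128),
    ("domain1", "visible"): torch.rand(8, 3, 256, 128),
}
mixed = domain_modality_mix(bank)
print(mixed.shape)  # torch.Size([3, 256, 128])
```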
View-to-label: Multi-view consistency for self-supervised monocular 3D object detection
IF 4.3, CAS Tier 3, Computer Science
Computer Vision and Image Understanding, Pub Date: 2025-03-01, DOI: 10.1016/j.cviu.2025.104320
Issa Mouawad, Nikolas Brasch, Fabian Manhardt, Federico Tombari, Francesca Odone
Abstract: For autonomous vehicles, driving safely depends heavily on the ability to correctly perceive the environment in 3D space, so 3D object detection is a fundamental aspect of perception. While 3D sensors deliver accurate metric perception, monocular approaches enjoy cost and availability advantages that are valuable in a wide range of applications. Unfortunately, training monocular methods requires a vast amount of annotated data. To compensate for this need, we propose a novel approach to self-supervise 3D object detection purely from RGB video sequences, leveraging geometric constraints and weak labels. Unlike other approaches that exploit additional sensors during training, our method relies on the temporal continuity of video sequences. Supervised pre-training on synthetic data produces initial plausible 3D boxes; our geometrically and photometrically grounded losses then provide a strong self-supervision signal that allows the model to be fine-tuned on real data without labels. Experiments on autonomous driving benchmark datasets showcase the effectiveness and generality of our approach and its competitive performance compared with other self-supervised approaches.
Volume 254, Article 104320. Citations: 0
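One way to picture the multi-view (temporal) consistency signal is a loss that asks 3D predictions in consecutive frames to agree once ego-motion is accounted for. The sketch below compares predicted box centers across frames under a known relative pose; it is a simplified stand-in that assumes static, already-matched objects, not the paper's full geometric and photometric losses.

```python
import torch
import torch.nn.functional as F

def center_consistency_loss(centers_t, centers_t1, R_rel, t_rel):
    """Illustrative temporal-consistency term: 3D box centers predicted at frame t,
    transformed by the relative camera pose (R_rel, t_rel), should agree with the
    centers predicted at frame t+1 for the same matched, static objects.

    centers_t, centers_t1: (N, 3) matched box centers in camera coordinates.
    R_rel: (3, 3) rotation, t_rel: (3,) translation from frame t to frame t+1.
    """
    warped = centers_t @ R_rel.T + t_rel
    return F.smooth_l1_loss(warped, centers_t1)

# Toy example with identity ego-motion and slightly perturbed predictions.
c_t = torch.randn(5, 3)
c_t1 = c_t + 0.01 * torch.randn(5, 3)
loss = center_consistency_loss(c_t, c_t1, torch.eye(3), torch.zeros(3))
print(loss.item())
```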
Spatial and temporal beliefs for mistake detection in assembly tasks
IF 4.3, CAS Tier 3, Computer Science
Computer Vision and Image Understanding, Pub Date: 2025-03-01, DOI: 10.1016/j.cviu.2025.104338
Guodong Ding, Fadime Sener, Shugao Ma, Angela Yao
Abstract: Assembly tasks, an integral part of daily routines and activities, involve a series of sequential steps that are prone to error. This paper proposes a novel method for identifying ordering mistakes in assembly tasks based on knowledge-grounded beliefs. The beliefs comprise spatial and temporal aspects, each serving a unique role. Spatial beliefs capture the structural relationships among assembly components and indicate their topological feasibility. Temporal beliefs model action preconditions and enforce sequencing constraints. Furthermore, we introduce a learning algorithm that dynamically updates and augments the belief sets online. For evaluation, we first test our approach in deducing predefined rules on synthetic data modeled on industrial assembly. We also verify our approach on the real-world Assembly101 dataset, enhanced with annotations of component information. Our framework achieves superior performance in detecting ordering mistakes under both synthetic and real-world settings, highlighting the effectiveness of our approach.
Volume 254, Article 104338. Citations: 0
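The temporal beliefs described above act like action preconditions. The toy check below flags an ordering mistake when a step's prerequisite steps have not yet been completed; the assembly steps and precondition table are made up for illustration and are unrelated to the paper's learned belief sets.

```python
def check_ordering(sequence, preconditions):
    """Toy precondition-style check: each step lists the steps that must already
    be completed. Returns the first ordering mistake and its unmet preconditions."""
    done = set()
    for step in sequence:
        missing = preconditions.get(step, set()) - done
        if missing:
            return step, missing      # ordering mistake detected
        done.add(step)
    return None, set()

# Hypothetical toy assembly: the roof needs both walls, the walls need the base.
preconditions = {
    "attach_wall_left": {"place_base"},
    "attach_wall_right": {"place_base"},
    "attach_roof": {"attach_wall_left", "attach_wall_right"},
}
mistake, unmet = check_ordering(
    ["place_base", "attach_wall_left", "attach_roof"], preconditions)
print(mistake, unmet)  # attach_roof {'attach_wall_right'}
```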
Incremental few-shot instance segmentation without fine-tuning on novel classes
IF 4.3, CAS Tier 3, Computer Science
Computer Vision and Image Understanding, Pub Date: 2025-03-01, DOI: 10.1016/j.cviu.2025.104323
Luofeng Zhang, Libo Weng, Yuanming Zhang, Fei Gao
Abstract: Many current incremental few-shot object detection and instance segmentation methods require fine-tuning on novel classes, which makes it difficult to train newly emerging classes on devices with limited computational power. In this paper, a finetune-free incremental few-shot instance segmentation method is proposed. First, a novel weight generator (NWG) is proposed to map the embeddings of novel classes to their respective true centers. Then, the limitations of cosine similarity on novel classes with few samples are analyzed, and a simple yet effective improvement, the piecewise function for similarity calculation (PFSC), is proposed. Finally, a probability dependency (PD) method is designed to mitigate the impact on base-class performance after registering novel classes. Comparative experiments show that the proposed model substantially outperforms existing finetune-free methods on the MS COCO and VOC datasets, and that registering novel classes has almost no negative impact on the base classes. The model therefore exhibits excellent performance, and the finetune-free design enables it to learn novel classes directly through inference on devices with limited computational power.
Volume 254, Article 104323. Citations: 0
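The abstract describes the piecewise function for similarity calculation (PFSC) only qualitatively. The sketch below shows one way a piecewise rescaling of cosine similarity could look, amplifying scores above a breakpoint; the breakpoint, slope, and overall shape are invented for illustration and almost certainly differ from the paper's PFSC.

```python
import torch
import torch.nn.functional as F

def piecewise_cosine_scores(features, class_weights, threshold=0.5, slope=4.0):
    """Illustrative piecewise rescaling of cosine similarity: scores below the
    threshold are kept as-is, while scores above it are amplified so that
    few-shot novel prototypes separate better. The breakpoint and slope are
    made-up values, not the paper's PFSC."""
    sims = F.normalize(features, dim=-1) @ F.normalize(class_weights, dim=-1).T
    boosted = threshold + slope * (sims - threshold)
    return torch.where(sims > threshold, boosted, sims)

feats = torch.randn(4, 256)      # RoI embeddings
weights = torch.randn(10, 256)   # base + novel class weights
print(piecewise_cosine_scores(feats, weights).shape)  # torch.Size([4, 10])
```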
When super-resolution meets camouflaged object detection: A comparison study
IF 4.3, CAS Tier 3, Computer Science
Computer Vision and Image Understanding, Pub Date: 2025-02-21, DOI: 10.1016/j.cviu.2025.104321
Juan Wen, Shupeng Cheng, Weiyan Hou, Luc Van Gool, Radu Timofte
Abstract: Super-resolution (SR) and camouflaged object detection (COD) are two prominent topics in computer vision with various joint applications. However, previous work has often studied the two areas in isolation. In this paper, we conduct a comprehensive comparative evaluation of both for the first time. Specifically, we benchmark different super-resolution methods on commonly used COD datasets, while also evaluating the robustness of different COD models on COD data processed by SR methods. The experiments reveal challenges in preserving semantic information due to differences in targets and features between the two domains. COD relies on extracting semantic information from low-resolution images to identify camouflaged targets, and important semantic details risk being lost or distorted when SR techniques are applied. Balancing the enhancement of spatial resolution with the preservation of semantic information is therefore crucial for maintaining the accuracy of COD algorithms. We propose a new SR model, Dilated Super-resolution (DISR), to enhance SR performance on COD, achieving state-of-the-art results on five commonly used SR datasets, including a 0.38 dB improvement on the Urban100 x4 task. Using low-resolution images processed by DISR for COD enhances target visibility and significantly improves COD performance. Our goal is to leverage the synergies between these two domains, draw insights from the complementarity of techniques in both fields, and provide inspiration for future research in both communities.
Volume 253, Article 104321. Citations: 0
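The abstract does not detail DISR's architecture. As a generic illustration of where dilated convolutions fit in a super-resolution network, the sketch below stacks dilated residual blocks in front of a standard PixelShuffle upsampler; every design choice here is an assumption, not the paper's model.

```python
import torch
import torch.nn as nn

class DilatedResidualBlock(nn.Module):
    """Two 3x3 convolutions with a residual connection; the first is dilated to
    enlarge the receptive field without losing resolution."""
    def __init__(self, channels, dilation=2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=dilation, dilation=dilation),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)

class TinySR(nn.Module):
    """Minimal x4 SR network: shallow feature extraction, dilated residual blocks,
    then PixelShuffle upsampling (a common SR layout, used only for illustration)."""
    def __init__(self, channels=64, num_blocks=4, scale=4):
        super().__init__()
        self.head = nn.Conv2d(3, channels, 3, padding=1)
        self.blocks = nn.Sequential(*[DilatedResidualBlock(channels) for _ in range(num_blocks)])
        self.tail = nn.Sequential(
            nn.Conv2d(channels, 3 * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),
        )

    def forward(self, x):
        return self.tail(self.blocks(self.head(x)))

lr = torch.randn(1, 3, 48, 48)
print(TinySR()(lr).shape)  # torch.Size([1, 3, 192, 192])
```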
MultiFire20K: A semi-supervised enhanced large-scale UAV-based benchmark for advancing multi-task learning in fire monitoring
IF 4.3, CAS Tier 3, Computer Science
Computer Vision and Image Understanding, Pub Date: 2025-02-19, DOI: 10.1016/j.cviu.2025.104318
Demetris Shianios, Panayiotis Kolios, Christos Kyrkou
Abstract: Effective fire detection and response are crucial to minimizing the widespread damage and loss caused by fires in both urban and natural environments. While advancements in computer vision have enhanced fire detection and response, progress in UAV-based monitoring remains limited due to the lack of comprehensive datasets. This study introduces the MultiFire20K dataset, comprising 20,500 diverse aerial fire images with annotations for fire classification, environment classification, and separate segmentation masks for both fire and smoke, specifically designed to support multi-task learning. Because labeled data is scarce in remote sensing, we explore a semi-supervised approach for generating fire and smoke pseudo-label masks that takes the environment of the event into account, and we experiment with various segmentation architectures and backbone models to generate reliable pseudo-label masks. Benchmarks were established by evaluating models on fire classification, environment classification, and the segmentation of both fire and smoke, and comparing these results with those obtained from multi-task models. Our study highlights the substantial advantages of a multi-task approach in fire monitoring, particularly in improving fire and smoke segmentation through shared knowledge during training. This efficiency, combined with savings in memory and computation, makes the multi-task framework superior for real-time applications compared with using separate models for each individual task. We anticipate that our dataset and benchmark results will encourage further research in fire surveillance, advancing fire detection and prevention methods.
Volume 254, Article 104318. Citations: 0
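The multi-task setting evaluated by the benchmark, fire classification, environment classification, and separate fire and smoke segmentation from a shared representation, can be sketched with a toy shared-encoder model; the backbone and heads below are generic placeholders, not the architectures benchmarked in the paper.

```python
import torch
import torch.nn as nn

class MultiTaskFireNet(nn.Module):
    """Toy shared-encoder multi-task model: one backbone, two classification heads
    (fire / environment) and two segmentation heads (fire mask / smoke mask)."""
    def __init__(self, num_envs=3, feat_ch=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, feat_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )
        self.fire_cls = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                      nn.Linear(feat_ch, 1))
        self.env_cls = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                     nn.Linear(feat_ch, num_envs))
        self.fire_seg = nn.Sequential(nn.Conv2d(feat_ch, 1, 1),
                                      nn.Upsample(scale_factor=4, mode="bilinear"))
        self.smoke_seg = nn.Sequential(nn.Conv2d(feat_ch, 1, 1),
                                       nn.Upsample(scale_factor=4, mode="bilinear"))

    def forward(self, x):
        f = self.encoder(x)  # shared features feed every task head
        return {
            "fire": self.fire_cls(f),
            "environment": self.env_cls(f),
            "fire_mask": self.fire_seg(f),
            "smoke_mask": self.smoke_seg(f),
        }

out = MultiTaskFireNet()(torch.randn(2, 3, 256, 256))
print({k: tuple(v.shape) for k, v in out.items()})
```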
Incremental few-shot instance segmentation via feature enhancement and prototype calibration
IF 4.3, CAS Tier 3, Computer Science
Computer Vision and Image Understanding, Pub Date: 2025-02-12, DOI: 10.1016/j.cviu.2025.104317
Weixiang Gao, Caijuan Shi, Rui Wang, Ao Cai, Changyu Duan, Meiqin Liu
Abstract: Incremental few-shot instance segmentation (iFSIS) aims to detect and segment instances of novel classes with only a few training samples, while maintaining performance on base classes without revisiting base class data. iMTFA, a representative iFSIS method, offers a flexible approach for adding novel classes. Its key mechanism generates novel class weights by normalizing and averaging embeddings obtained from K-shot novel instances. However, relying on such a small sample size often leads to an insufficient representation of the real class distribution, which in turn results in biased weights for the novel classes. Furthermore, because it does not fine-tune on novel classes, iMTFA tends to predict potential novel-class foregrounds as background, which exacerbates the bias in the generated novel class weights. To overcome these limitations, we propose a simple but effective iFSIS method named Enhancement and Calibration-based iMTFA (EC-iMTFA). Specifically, we first design an embedding enhancement and aggregation (EEA) module, which enhances the feature diversity of each novel instance embedding before generating novel class weights. We then design a novel prototype calibration (NPC) module that leverages the well-calibrated base class and background weights in the classifier to enhance the discriminability of novel class prototypes. In addition, a simple weight preprocessing (WP) mechanism is designed on top of NPC to further improve the calibration process. Extensive experiments on the COCO and VOC datasets demonstrate that EC-iMTFA outperforms iMTFA in iFSIS and iFSOD performance, stability, and efficiency without requiring fine-tuning on novel classes. Moreover, EC-iMTFA achieves competitive results compared with recent state-of-the-art methods.
Volume 253, Article 104317. Citations: 0
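The iMTFA-style weight generation that the abstract builds on (normalize and average K-shot embeddings) is easy to write down, and a hedged sketch of prototype calibration against base-class weights can be layered on top; the blend ratio, top-k neighbor choice, and softmax weighting below are assumptions, not the paper's NPC formulation.

```python
import torch
import torch.nn.functional as F

def novel_class_weight(shot_embeddings):
    """iMTFA-style weight generation as described in the abstract: L2-normalize
    each K-shot instance embedding, average, and re-normalize."""
    return F.normalize(F.normalize(shot_embeddings, dim=-1).mean(dim=0), dim=-1)

def calibrate_with_base(novel_w, base_ws, alpha=0.8, top_k=3):
    """Illustrative calibration in the spirit of NPC: blend the raw novel prototype
    with a similarity-weighted combination of its nearest base-class weights.
    The blend ratio, top-k, and softmax weighting are assumptions."""
    sims = base_ws @ novel_w                        # cosine sims (unit-norm weights)
    top = sims.topk(top_k)
    mix = (top.values.softmax(dim=0).unsqueeze(1) * base_ws[top.indices]).sum(dim=0)
    return F.normalize(alpha * novel_w + (1 - alpha) * mix, dim=-1)

shots = torch.randn(5, 1024)                        # K=5 novel instance embeddings
base = F.normalize(torch.randn(60, 1024), dim=-1)   # base-class classifier weights
w = calibrate_with_base(novel_class_weight(shots), base)
print(w.shape)  # torch.Size([1024])
```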
Cartoon character recognition based on portrait style fusion
IF 4.3, CAS Tier 3, Computer Science
Computer Vision and Image Understanding, Pub Date: 2025-02-10, DOI: 10.1016/j.cviu.2025.104316
De Li, Zhenyi Jin, Xun Jin
Abstract: In this paper, we propose a cartoon character recognition method that uses portrait characteristics to address the problem of copyright protection for cartoon works. The proposed recognition framework is derived from a content-based retrieval mechanism, providing an effective solution for copyright identification of cartoon characters. This research makes two core contributions. The first is an ECA-based residual attention module that improves the learning of cartoon character features: cartoon character images typically contain less detail and texture information, and inter-channel information interaction extracts cartoon features more effectively. The second is a style transfer-based cartoon character construction mechanism that creates a simulated plagiarized cartoon character dataset by fusing portrait style and content. Comparative experiments demonstrate that the proposed model effectively improves detection accuracy. Finally, we validate the effectiveness and feasibility of the model by retrieving plagiarized versions of cartoon characters.
Volume 253, Article 104316. Citations: 0
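The ECA module referenced in the abstract is the standard Efficient Channel Attention block: global average pooling followed by a 1D convolution across channels and a sigmoid gate. The sketch below shows that building block only; how the paper embeds it in its residual attention module is not specified in the abstract.

```python
import torch
import torch.nn as nn

class ECA(nn.Module):
    """Efficient Channel Attention (as in ECA-Net): global average pooling,
    a 1D convolution over the channel axis, and a sigmoid gate that rescales
    the input channels."""
    def __init__(self, kernel_size=3):
        super().__init__()
        self.conv = nn.Conv1d(1, 1, kernel_size, padding=kernel_size // 2, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):                         # x: (B, C, H, W)
        y = x.mean(dim=(2, 3))                    # global average pool -> (B, C)
        y = self.conv(y.unsqueeze(1)).squeeze(1)  # 1D conv across channels
        return x * self.sigmoid(y)[:, :, None, None]

x = torch.randn(2, 64, 32, 32)
print(ECA()(x).shape)  # torch.Size([2, 64, 32, 32])
```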