IEEE Transactions on Multimedia: Latest Articles

Improving Network Interpretability via Explanation Consistency Evaluation
IF 8.4 · Tier 1 · Computer Science
IEEE Transactions on Multimedia · Pub Date: 2024-09-16 · DOI: 10.1109/TMM.2024.3453058
Hefeng Wu;Hao Jiang;Keze Wang;Ziyi Tang;Xianghuan He;Liang Lin
{"title":"Improving Network Interpretability via Explanation Consistency Evaluation","authors":"Hefeng Wu;Hao Jiang;Keze Wang;Ziyi Tang;Xianghuan He;Liang Lin","doi":"10.1109/TMM.2024.3453058","DOIUrl":"https://doi.org/10.1109/TMM.2024.3453058","url":null,"abstract":"While deep neural networks have achieved remarkable performance, they tend to lack transparency in prediction. The pursuit of greater interpretability in neural networks often results in a degradation of their original performance. Some works strive to improve both interpretability and performance, but they primarily depend on meticulously imposed conditions. In this paper, we propose a simple yet effective framework that acquires more explainable activation heatmaps and simultaneously increases the model performance, without the need for any extra supervision. Specifically, our concise framework introduces a new metric, i.e., explanation consistency, to reweight the training samples adaptively in model learning. The explanation consistency metric is utilized to measure the similarity between the model's visual explanations of the original samples and those of semantic-preserved adversarial samples, whose background regions are perturbed by using image adversarial attack techniques. Our framework then promotes the model learning by paying closer attention to those training samples with a high difference in explanations (i.e., low explanation consistency), for which the current model cannot provide robust interpretations. Comprehensive experimental results on various benchmarks demonstrate the superiority of our framework in multiple aspects, including higher recognition accuracy, greater data debiasing capability, stronger network robustness, and more precise localization ability on both regular networks and interpretable networks. We also provide extensive ablation studies and qualitative analyses to unveil the detailed contribution of each component.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"26 ","pages":"11261-11273"},"PeriodicalIF":8.4,"publicationDate":"2024-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142691775","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Deep Mutual Distillation for Unsupervised Domain Adaptation Person Re-identification
IF 7.3 · Tier 1 · Computer Science
IEEE Transactions on Multimedia · Pub Date: 2024-09-12 · DOI: 10.1109/tmm.2024.3459637
Xingyu Gao, Zhenyu Chen, Jianze Wei, Rubo Wang, Zhijun Zhao
{"title":"Deep Mutual Distillation for Unsupervised Domain Adaptation Person Re-identification","authors":"Xingyu Gao, Zhenyu Chen, Jianze Wei, Rubo Wang, Zhijun Zhao","doi":"10.1109/tmm.2024.3459637","DOIUrl":"https://doi.org/10.1109/tmm.2024.3459637","url":null,"abstract":"","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"1 1","pages":""},"PeriodicalIF":7.3,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142178712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Collaborative License Plate Recognition via Association Enhancement Network With Auxiliary Learning and a Unified Benchmark
IF 8.4 · Tier 1 · Computer Science
IEEE Transactions on Multimedia · Pub Date: 2024-09-10 · DOI: 10.1109/TMM.2024.3452982
Yifei Deng;Guohao Wang;Chenglong Li;Wei Wang;Cheng Zhang;Jin Tang
{"title":"Collaborative License Plate Recognition via Association Enhancement Network With Auxiliary Learning and a Unified Benchmark","authors":"Yifei Deng;Guohao Wang;Chenglong Li;Wei Wang;Cheng Zhang;Jin Tang","doi":"10.1109/TMM.2024.3452982","DOIUrl":"10.1109/TMM.2024.3452982","url":null,"abstract":"Since the standard license plate of large vehicle is easily affected by occlusion and stain, the traffic management department introduces the enlarged license plate at the rear of the large vehicle to assist license plate recognition. However, current researches regards standard license plate recognition and enlarged license plate recognition as independent tasks, and do not take advantage of the complementary benefits from the two types of license plates. In this work, we propose a new computer vision task called collaborative license plate recognition, aiming to leverage the complementary advantages of standard and enlarged license plates for achieving more accurate license plate recognition. To achieve this goal, we propose an Association Enhancement Network (AENet), which achieves robust collaborative licence plate recognition by capturing the correlations between characters within a single licence plate and enhancing the associations between two license plates. In particular, we design an association enhancement branch, which supervises the fusion of two licence plate information using the complete licence plate number to mine the association between them. To enhance the representation ability of each type of licence plates, we design an auxiliary learning branch in the training stage, which supervises the learning of individual license plates in the association enhancement between two license plates. In addition, we contribute a comprehensive benchmark dataset called CLPR, which consists of a total of 19,782 standard and enlarged licence plates from 24 provinces in China and covers most of the challenges in real scenarios, for collaborative license plate recognition. Extensive experiments on the proposed CLPR dataset demonstrate the effectiveness of the proposed AENet against several state-of-the-art methods.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"26 ","pages":"11402-11414"},"PeriodicalIF":8.4,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142178713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
VLDadaptor: Domain Adaptive Object Detection With Vision-Language Model Distillation
IF 8.4 · Tier 1 · Computer Science
IEEE Transactions on Multimedia · Pub Date: 2024-09-06 · DOI: 10.1109/TMM.2024.3453061
Junjie Ke;Lihuo He;Bo Han;Jie Li;Di Wang;Xinbo Gao
{"title":"VLDadaptor: Domain Adaptive Object Detection With Vision-Language Model Distillation","authors":"Junjie Ke;Lihuo He;Bo Han;Jie Li;Di Wang;Xinbo Gao","doi":"10.1109/TMM.2024.3453061","DOIUrl":"10.1109/TMM.2024.3453061","url":null,"abstract":"Domain adaptive object detection (DAOD) aims to develop a detector trained on labeled source domains to identify objects in unlabeled target domains. A primary challenge in DAOD is the domain shift problem. Most existing methods learn domain-invariant features within single domain embedding space, often resulting in heavy model biases due to the intrinsic data properties of source domains. To mitigate the model biases, this paper proposes VLDadaptor, a domain adaptive object detector based on vision-language models (VLMs) distillation. Firstly, the proposed method integrates domain-mixed contrastive knowledge distillation between the visual encoder of CLIP and the detector by transferring category-level instance features, which guarantees the detector can extract domain-invariant visual instance features across domains. Then, VLDadaptor employs domain-mixed consistency distillation between the text encoder of CLIP and detector by aligning text prompt embeddings with visual instance features, which helps to maintain the category-level feature consistency among the detector, text encoder and the visual encoder of VLMs. Finally, the proposed method further promotes the adaptation ability by adopting a prompt-based memory bank to generate semantic-complete features for graph matching. These contributions enable VLDadaptor to extract visual features into the visual-language embedding space without any evident model bias towards specific domains. Extensive experimental results demonstrate that the proposed method achieves state-of-the-art performance on Pascal VOC to Clipart adaptation tasks and exhibits high accuracy on driving scenario tasks with significantly less training time.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"26 ","pages":"11316-11331"},"PeriodicalIF":8.4,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142178588","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Camera-Incremental Object Re-Identification With Identity Knowledge Evolution
IF 8.4 · Tier 1 · Computer Science
IEEE Transactions on Multimedia · Pub Date: 2024-09-05 · DOI: 10.1109/TMM.2024.3453045
Hantao Yao;Jifei Luo;Lu Yu;Changsheng Xu
{"title":"Camera-Incremental Object Re-Identification With Identity Knowledge Evolution","authors":"Hantao Yao;Jifei Luo;Lu Yu;Changsheng Xu","doi":"10.1109/TMM.2024.3453045","DOIUrl":"10.1109/TMM.2024.3453045","url":null,"abstract":"Object Re-identification (ReID) is a task focused on retrieving a probe object from a multitude of gallery images using a ReID model trained on a stationary, camera-free dataset. This training involves associating and aggregating identities across various camera views. However, when deploying ReID algorithms in real-world scenarios, several challenges, such as storage constraints, privacy considerations, and dynamic changes in camera setups, can hinder their generalizability and practicality. To address these challenges, we introduce a novel ReID task called Camera-Incremental Object Re-identification (CIOR). In CIOR, we treat each camera's data as a separate source and continually optimize the ReID model as new data streams come from various cameras. By associating and consolidating the knowledge of common identities, our aim is to enhance discrimination capabilities and mitigate the problem of catastrophic forgetting. Therefore, we propose a novel Identity Knowledge Evolution (IKE) framework for CIOR, consisting of Identity Knowledge Association (IKA), Identity Knowledge Distillation (IKD), and Identity Knowledge Update (IKU). IKA is proposed to discover common identities between the current identity and historical identities, facilitating the integration of previously acquired knowledge. IKD involves distilling historical identity knowledge from common identities, enabling rapid adaptation of the historical model to the current camera view. After each camera has been trained, IKU is applied to continually expand identity knowledge by combining historical and current identity memories. Market-CL and Veri-CL evaluations show the effectiveness of Identity Knowledge Evolution (IKE) for CIOR.Code: \u0000<uri>https://github.com/htyao89/Camera-Incremental-Object-ReID</uri>","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"26 ","pages":"11246-11260"},"PeriodicalIF":8.4,"publicationDate":"2024-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142178715","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Dual-View Data Hallucination With Semantic Relation Guidance for Few-Shot Image Recognition
IF 8.4 · Tier 1 · Computer Science
IEEE Transactions on Multimedia · Pub Date: 2024-09-02 · DOI: 10.1109/TMM.2024.3453055
Hefeng Wu;Guangzhi Ye;Ziyang Zhou;Ling Tian;Qing Wang;Liang Lin
{"title":"Dual-View Data Hallucination With Semantic Relation Guidance for Few-Shot Image Recognition","authors":"Hefeng Wu;Guangzhi Ye;Ziyang Zhou;Ling Tian;Qing Wang;Liang Lin","doi":"10.1109/TMM.2024.3453055","DOIUrl":"10.1109/TMM.2024.3453055","url":null,"abstract":"Learning to recognize novel concepts from just a few image samples is very challenging as the learned model is easily overfitted on the few data and results in poor generalizability. One promising but underexplored solution is to compensate for the novel classes by generating plausible samples. However, most existing works of this line exploit visual information only, rendering the generated data easy to be distracted by some challenging factors contained in the few available samples. Being aware of the semantic information in the textual modality that reflects human concepts, this work proposes a novel framework that exploits semantic relations to guide dual-view data hallucination for few-shot image recognition. The proposed framework enables generating more diverse and reasonable data samples for novel classes through effective information transfer from base classes. Specifically, an instance-view data hallucination module hallucinates each sample of a novel class to generate new data by employing local semantic correlated attention and global semantic feature fusion derived from base classes. Meanwhile, a prototype-view data hallucination module exploits semantic-aware measure to estimate the prototype of a novel class and the associated distribution from the few samples, which thereby harvests the prototype as a more stable sample and enables resampling a large number of samples. We conduct extensive experiments and comparisons with state-of-the-art methods on several popular few-shot benchmarks to verify the effectiveness of the proposed framework.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"26 ","pages":"11302-11315"},"PeriodicalIF":8.4,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142178587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
IEIRNet: Inconsistency Exploiting Based Identity Rectification for Face Forgery Detection
IF 8.4 · Tier 1 · Computer Science
IEEE Transactions on Multimedia · Pub Date: 2024-09-02 · DOI: 10.1109/TMM.2024.3453066
Mingqi Fang;Lingyun Yu;Yun Song;Yongdong Zhang;Hongtao Xie
{"title":"IEIRNet: Inconsistency Exploiting Based Identity Rectification for Face Forgery Detection","authors":"Mingqi Fang;Lingyun Yu;Yun Song;Yongdong Zhang;Hongtao Xie","doi":"10.1109/TMM.2024.3453066","DOIUrl":"10.1109/TMM.2024.3453066","url":null,"abstract":"Face forgery detection has attracted much attention due to the ever-increasing social concerns caused by facial manipulation techniques. Recently, identity-based detection methods have made considerable progress, which is especially suitable in the celebrity protection scenario. However, they still suffer from two main limitations: (a) generic identity extractor is not specifically designed for forgery detection, leading to nonnegligible \u0000<italic>Identity Representation Bias</i>\u0000 to forged images. (b) existing methods only analyze the identity representation of each image individually, but ignores the query-reference interaction for inconsistency exploiting. To address these issues, a novel \u0000<italic>Inconsistency Exploiting based Identity Rectification Network</i>\u0000 (IEIRNet) is proposed in this paper. Firstly, for the identity bias rectification, the IEIRNet follows an effective two-branches structure. Besides the \u0000<italic>Generic Identity Extractor</i>\u0000 (GIE) branch, an essential \u0000<italic>Bias Diminishing Module</i>\u0000 (BDM) branch is proposed to eliminate the identity bias through a novel \u0000<italic>Attention-based Bias Rectification</i>\u0000 (ABR) component, accordingly acquiring the ultimate discriminative identity representation. Secondly, for query-reference inconsistency exploiting, an \u0000<italic>Inconsistency Exploiting Module</i>\u0000 (IEM) is applied in IEIRNet to comprehensively exploit the inconsistency clues from both spatial and channel perspectives. In the spatial aspect, an innovative region-aware kernel is derived to activate the local region inconsistency with deep spatial interaction. Afterward in the channel aspect, a coattention mechanism is utilized to model the channel interaction meticulously, and accordingly highlight the channel-wise inconsistency with adaptive weight assignment and channel-wise dropout. Our IEIRNet has shown effectiveness and superiority in various generalization and robustness experiments.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"26 ","pages":"11232-11245"},"PeriodicalIF":8.4,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142178722","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Pixel-Learnable 3DLUT With Saturation-Aware Compensation for Image Enhancement
IF 8.4 · Tier 1 · Computer Science
IEEE Transactions on Multimedia · Pub Date: 2024-09-02 · DOI: 10.1109/TMM.2024.3453064
Jing Liu;Qingying Li;Xiongkuo Min;Yuting Su;Guangtao Zhai;Xiaokang Yang
{"title":"Pixel-Learnable 3DLUT With Saturation-Aware Compensation for Image Enhancement","authors":"Jing Liu;Qingying Li;Xiongkuo Min;Yuting Su;Guangtao Zhai;Xiaokang Yang","doi":"10.1109/TMM.2024.3453064","DOIUrl":"10.1109/TMM.2024.3453064","url":null,"abstract":"The 3D Lookup Table (3DLUT)-based methods are gaining popularity due to their satisfactory and stable performance in achieving automatic and adaptive real time image enhancement. In this paper, we present a new solution to the intractability in handling continuous color transformations of 3DLUT due to the lookup via three independent color channel coordinates in RGB space. Inspired by the inherent merits of the HSV color space, we separately enhance image intensity and color composition. The Transformer-based Pixel-Learnable 3D Lookup Table is proposed to undermine contouring artifacts, which enhances images in a pixel-wise manner with non-local information to emphasize the diverse spatially variant context. In addition, noticing the underestimation of composition color component, we develop the Saturation-Aware Compensation (SAC) module to enhance the under-saturated region determined by an adaptive SA map with Saturation-Interaction block, achieving well balance between preserving details and color rendition. Our approach can be applied to image retouching and tone mapping tasks with fairly good generality, especially in restoring localized regions with weak visibility. The performance in both theoretical analysis and comparative experiments manifests that the proposed solution is effective and robust.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"26 ","pages":"11219-11231"},"PeriodicalIF":8.4,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142178719","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
End-to-End Image Colorization With Multiscale Pyramid Transformer
IF 8.4 · Tier 1 · Computer Science
IEEE Transactions on Multimedia · Pub Date: 2024-09-02 · DOI: 10.1109/TMM.2024.3453035
Tongtong Zhao;Gehui Li;Shanshan Zhao
{"title":"End-to-End Image Colorization With Multiscale Pyramid Transformer","authors":"Tongtong Zhao;Gehui Li;Shanshan Zhao","doi":"10.1109/TMM.2024.3453035","DOIUrl":"10.1109/TMM.2024.3453035","url":null,"abstract":"Image colorization is a challenging task due to its ill-posed and multimodal nature, leading to unsatisfactory results in traditional approaches that rely on reference images or user guides. Although deep learning-based methods have been proposed, they may not be sufficient due to the lack of semantic understanding. To overcome this limitation, we present an innovative end-to-end automatic colorization method that does not require any color reference images and achieves superior quantitative and qualitative results compared to state-of-the-art methods. Our approach incorporates a Multiscale Pyramid Transformer that captures both local and global contextual information and a novel attention module called Dual-Attention, which replaces the traditional Window Attention and Channel Attention with faster and lighter Separable Dilated Attention and Factorized Channel Attention. Additionally, we introduce a new color decoder called Color-Attention, which learns colorization patterns from grayscale images and color images of the current training set, resulting in improved generalizability and eliminating the need for constructing color priors. Experimental results demonstrate the effectiveness of our approach in various benchmark datasets, including high-level computer vision tasks such as classification, segmentation, and detection. Our method offers robustness, generalization ability, and improved colorization quality, making it a valuable contribution to the field of image colorization.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"26 ","pages":"11332-11344"},"PeriodicalIF":8.4,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142178721","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Coarse-to-Fine Target Detection for HFSWR With Spatial-Frequency Analysis and Subnet Structure
IF 8.4 · Tier 1 · Computer Science
IEEE Transactions on Multimedia · Pub Date: 2024-09-02 · DOI: 10.1109/TMM.2024.3453044
Wandong Zhang;Yimin Yang;Tianlong Liu
{"title":"Coarse-to-Fine Target Detection for HFSWR With Spatial-Frequency Analysis and Subnet Structure","authors":"Wandong Zhang;Yimin Yang;Tianlong Liu","doi":"10.1109/TMM.2024.3453044","DOIUrl":"10.1109/TMM.2024.3453044","url":null,"abstract":"High-frequency surface wave radar (HFSWR) is a powerful tool for ship detection and surveillance. blackHowever, the use of pre-trained deep learning (DL) networks for ship detection is challenging due to the limited training samples in HFSWR and the substantial differences between remote sensing images and everyday images. To tackle these issues, this paper proposes a coarse-to-fine target detection approach that combines traditional methods with DL, resulting in improved performance. The contributions of this work include: 1) a two-stage learning pipeline that integrates spatial-frequency analysis (SFA) with subnet-based neural networks, 2) an automatic linear thresholding algorithm for plausible target region (PTR) detection, and 3) a robust subnet neural network for fine target detection. The advantage of using SFA and subnet network is that the SFA reduces the need for extensive training data, while the subnet neural network excels at localizing ships even with limited training data. Experimental results on the HFSWR-RD dataset affirm the model's superior performance compared to rival algorithms.","PeriodicalId":13273,"journal":{"name":"IEEE Transactions on Multimedia","volume":"26 ","pages":"11290-11301"},"PeriodicalIF":8.4,"publicationDate":"2024-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142223657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0