IEEE Transactions on Multimedia: Latest Articles

Accelerated Lloyd's Method for Resampling 3D Point Clouds
IF 7.3, Region 1, Computer Science
IEEE Transactions on Multimedia Pub Date : 2024-05-27 DOI: 10.1109/tmm.2024.3405664
Yanyang Xiao, Tieyi Zhang, Juan Cao, Zhonggui Chen
Citations: 0
Bipartite Graph-Based Projected Clustering With Local Region Guidance for Hyperspectral Imagery
IF 8.4, Region 1, Computer Science
IEEE Transactions on Multimedia Pub Date : 2024-04-30 DOI: 10.1109/TMM.2024.3394975
Yongshan Zhang, Guozhu Jiang, Zhihua Cai, Yicong Zhou
Abstract: Hyperspectral image (HSI) clustering, which divides all pixels into different clusters, is challenging because of absent labels, large spectral variability, and complex spatial distribution. The anchor strategy provides an attractive solution to the computational bottleneck of graph-based clustering for large HSIs. However, most existing methods require separate learning procedures and ignore noise as well as spatial information. In this paper, we propose a bipartite graph-based projected clustering (BGPC) method with local region guidance for HSI data. To take full advantage of spatial information, HSI denoising to alleviate noise interference and anchor initialization to construct the bipartite graph are conducted within each generated superpixel. With the denoised pixels and initial anchors, projection learning and structured bipartite graph learning are simultaneously performed in a one-step learning model with a connectivity constraint to directly provide clustering results. An alternating optimization algorithm is devised to solve the formulated model. The advantage of BGPC is the joint learning of projection and bipartite graph with local region guidance to exploit spatial information, and linear time complexity to lessen the computational burden. Extensive experiments demonstrate the superiority of the proposed BGPC over the state-of-the-art HSI clustering methods.
Vol. 26, pp. 9551-9563
Citations: 0
Each Performs Its Functions: Task Decomposition and Feature Assignment for Audio-Visual Segmentation
IF 8.4, Region 1, Computer Science
IEEE Transactions on Multimedia Pub Date : 2024-04-30 DOI: 10.1109/TMM.2024.3394682
Sen Xu, Shikui Wei, Tao Ruan, Lixin Liao, Yao Zhao
Abstract: Audio-visual segmentation (AVS) aims to segment the object instances that produce sound at the time of the video frames. Existing related solutions focus on designing cross-modal interaction mechanisms, which try to learn audio-visual correlations and simultaneously segment objects. Despite their effectiveness, the closely coupled network structures become increasingly complex and hard to analyze. To address these problems, we propose a simple but effective method, 'Each Performs Its Functions (PIF),' which focuses on task decomposition and feature assignment. Inspired by human sensory experiences, PIF decouples AVS into two subtasks, correlation learning and segmentation refinement, via two branches. Correlation learning aims to learn the correspondence between sound and visible individuals and provide the positional prior. Segmentation refinement focuses on fine segmentation. Then we assign features of different levels to perform the appropriate duties, i.e., using deep features for cross-modal interaction due to their semantic advantages, and using the rich textures of shallow features to improve segmentation results. Moreover, we propose the recurrent collaboration block to enhance inter-branch communication. Experimental results on AVSBench show that our method outperforms related state-of-the-art methods by a large margin (e.g., +6.0% mIoU and +7.6% F-score on the Multi-Source subset). In addition, by purposely boosting subtasks' performance, our approach can serve as a strong baseline for audio-visual segmentation.
Vol. 26, pp. 9489-9498
Citations: 0
Neighborhood-Aware Mutual Information Maximization for Source-Free Domain Adaptation
IF 8.4, Region 1, Computer Science
IEEE Transactions on Multimedia Pub Date : 2024-04-30 DOI: 10.1109/TMM.2024.3394971
Lin Zhang, Yifan Wang, Ran Song, Mingxin Zhang, Xiaolei Li, Wei Zhang
Abstract: Recently, the source-free domain adaptation (SFDA) problem has attracted much attention, where the pre-trained model for the source domain is adapted to the target domain in the absence of source data. However, due to domain shift, negative alignment usually exists between samples from the same class, which may lower intra-class feature similarity. To address this issue, we present a self-supervised representation learning strategy for SFDA, named neighborhood-aware mutual information (NAMI), which maximizes the mutual information (MI) between the representations of target samples and their corresponding neighbors. Moreover, we theoretically demonstrate that NAMI can be decomposed into a weighted sum of local MI, which suggests that the weighted terms can better estimate NAMI. To this end, we introduce a neighborhood consensus score over the set of weakly and strongly augmented views and a point-wise density based on the neighborhood, both of which determine the weights of local MI for NAMI by leveraging the neighborhood information of samples. The proposed method can significantly handle domain shift and adaptively reduce the noise in the neighborhood of each target sample. In combination with the consistency loss over views, NAMI leads to consistent improvement over existing state-of-the-art methods on three popular SFDA benchmarks.
Vol. 26, pp. 9564-9574
Citations: 0
Semantic-Enhanced Proxy-Guided Hashing for Long-Tailed Image Retrieval
IF 8.4, Region 1, Computer Science
IEEE Transactions on Multimedia Pub Date : 2024-04-29 DOI: 10.1109/TMM.2024.3394684
Hongtao Xie, Yan Jiang, Lei Zhang, Pandeng Li, Dongming Zhang, Yongdong Zhang
Abstract: Hashing has been studied extensively for large-scale image retrieval due to its efficient computation and storage. Deep hashing methods typically train models with category-balanced data and suffer from serious performance deterioration when dealing with long-tailed training samples. Recently, several long-tailed hashing methods have focused on this newly emerging field for practical purposes. However, existing methods still face the challenge that fixed category centers with limited semantic information cannot effectively improve the discriminative ability of tail-category hash codes. To tackle this issue, we propose a novel method called Semantic-enhanced Proxy-guided Hashing in this paper. We leverage two sets of learnable category proxies in the feature space and the Hamming space respectively, which can describe category semantics by being updated continuously along with the whole model via back-propagation. Based on this, we introduce the Mahalanobis distance metric to characterize relationships accurately and enhance the semantic representation of both proxies and samples concurrently, improving the hash learning process. Moreover, we capture the multilateral correlations between proxies and samples in the feature space and extend a hypergraph neural network to transfer semantic knowledge from proxies to samples in the Hamming space. Extensive experiments show that our method achieves state-of-the-art performance and surpasses existing methods by 1.47%–7.56% MAP on long-tailed benchmarks, demonstrating the superiority of learnable category proxies and the effectiveness of our proposed learning algorithm for long-tailed hashing.
Vol. 26, pp. 9499-9514
Citations: 0
Single-Shot and Multi-Shot Feature Learning for Multi-Object Tracking
IF 8.4, Region 1, Computer Science
IEEE Transactions on Multimedia Pub Date : 2024-04-29 DOI: 10.1109/TMM.2024.3394683
Yizhe Li, Sanping Zhou, Zheng Qin, Le Wang, Jinjun Wang, Nanning Zheng
Abstract: Multi-Object Tracking (MOT) remains a vital component of intelligent video analysis, which aims to locate targets and maintain a consistent identity for each target throughout a video sequence. Existing works usually learn a discriminative feature representation, such as motion and appearance, to associate the detections across frames, which are easily affected by mutual occlusion and background clutter in practice. In this paper, we propose a simple yet effective two-stage feature learning paradigm to jointly learn single-shot and multi-shot features for different targets, so as to achieve robust data association in the tracking process. For detections that have not been associated, we design a novel single-shot feature learning module to extract discriminative features of each detection, which can efficiently associate targets between adjacent frames. For tracklets that have been lost for several frames, we design a novel multi-shot feature learning module to extract discriminative features of each tracklet, which can accurately re-find these lost targets after a long period. Once equipped with a simple data association logic, the resulting VisualTracker can perform robust MOT based on the single-shot and multi-shot feature representations. Extensive experimental results demonstrate that our method achieves significant improvements on the MOT17 and MOT20 datasets while reaching state-of-the-art performance on the DanceTrack dataset.
Vol. 26, pp. 9515-9526
Citations: 0
Part-Aware Correlation Networks for Few-Shot Learning
IF 8.4, Region 1, Computer Science
IEEE Transactions on Multimedia Pub Date : 2024-04-29 DOI: 10.1109/TMM.2024.3394681
Ruiheng Zhang, Jinyu Tan, Zhe Cao, Lixin Xu, Yumeng Liu, Lingyu Si, Fuchun Sun
Abstract: Few-shot learning brings the machine close to human thinking, which enables fast learning with limited samples. Recent work considers local features to achieve contextual semantic complementation, while they are merely coarsened feature observations that can only extract insignificant label correlations. On the contrary, partial properties of few-shot examples significantly draw the implicit feature observations that can reveal the underlying label correlation of rare label classification. To fully explore the correlation between labels and partial features, this paper proposes a Part-Aware Correlation Network (PACNet) based on Partial Representation (PR) and Semantic Covariance Matrix (SCM). Specifically, we develop a partial representing module of an object that eliminates object-independent information and allows the model to focus on more distinctive parts. Furthermore, a semantic covariance measure function is redefined as a way to learn the semantic relationships of partial representations and to compute the partial similarity between the query sample and the support set. Experiments on three benchmark datasets consistently show that the proposed method outperforms the state-of-the-art counterparts; e.g., on the PartImageNet dataset, performance gains of up to 12% and 5.9% are observed for the 5-way 1-shot and 5-way 5-shot settings, respectively.
Vol. 26, pp. 9527-9538
Citations: 0
Domain-Oriented Knowledge Transfer for Cross-Domain Recommendation
IF 8.4, Region 1, Computer Science
IEEE Transactions on Multimedia Pub Date : 2024-04-29 DOI: 10.1109/TMM.2024.3394686
Guoshuai Zhao, Xiaolong Zhang, Hao Tang, Jialie Shen, Xueming Qian
Abstract: Cross-Domain Recommendation (CDR) aims to alleviate the cold-start problem by transferring knowledge from a data-rich domain (source domain) to a data-sparse domain (target domain), where knowledge needs to be transferred through a bridge connecting the two domains. Therefore, constructing a bridge connecting the two domains is fundamental for enabling cross-domain recommendation. However, existing CDR methods often overlook the value of natural relationships between items in connecting the two domains. To address this issue, we propose DKTCDR: a Domain-oriented Knowledge Transfer method for Cross-Domain Recommendation. In DKTCDR, we leverage the rich relationships between items in a cross-domain knowledge graph as bridges to facilitate both intra- and inter-domain knowledge transfer. Additionally, we design a cross-domain knowledge transfer strategy to enhance inter-domain knowledge transfer. Furthermore, we integrate the semantic modality information of items with the knowledge graph modality information to enhance item modeling. To support our investigation, we construct two high-quality cross-domain recommendation datasets, each containing a cross-domain knowledge graph. Our experimental results on these datasets validate the effectiveness of our proposed method. Source code is available at https://github.com/zxxxl123/DKTCDR.
Vol. 26, pp. 9539-9550
Citations: 0
Group Multi-View Transformer for 3D Shape Analysis With Spatial Encoding
IF 8.4, Region 1, Computer Science
IEEE Transactions on Multimedia Pub Date : 2024-04-29 DOI: 10.1109/TMM.2024.3394731
Lixiang Xu, Qingzhe Cui, Richang Hong, Wei Xu, Enhong Chen, Xin Yuan, Chenglong Li, Yuanyan Tang
Abstract: In recent years, the results of view-based 3D shape recognition methods have saturated, and models with excellent performance cannot be deployed on memory-limited devices due to their huge number of parameters. To address this problem, we introduce a compression method based on knowledge distillation for this field, which largely reduces the number of parameters while preserving model performance as much as possible. Specifically, to enhance the capabilities of smaller models, we design a high-performing large model called Group Multi-view Vision Transformer (GMViT). In GMViT, the view-level ViT first establishes relationships between view-level features. Additionally, to capture deeper features, we employ the grouping module to enhance view-level features into group-level features. Finally, the group-level ViT aggregates group-level features into complete, well-formed 3D shape descriptors. Notably, in both ViTs, we introduce spatial encoding of camera coordinates as innovative position embeddings. Furthermore, we propose two compressed versions based on GMViT, namely GMViT-simple and GMViT-mini. To enhance the training effectiveness of the small models, we introduce a knowledge distillation method throughout the GMViT process, where the key outputs of each GMViT component serve as distillation targets. Extensive experiments demonstrate the efficacy of the proposed method. The large model GMViT achieves excellent 3D classification and retrieval results on the benchmark datasets ModelNet, ShapeNetCore55, and MCB. The smaller models, GMViT-simple and GMViT-mini, reduce the parameter size by 8 and 17.6 times, respectively, and improve shape recognition speed by 1.5 times on average, while preserving at least 90% of the recognition performance.
Vol. 26, pp. 9450-9463
Citations: 0
Towards High-Quality Photorealistic Image Style Transfer
IF 8.4, Region 1, Computer Science
IEEE Transactions on Multimedia Pub Date : 2024-04-29 DOI: 10.1109/TMM.2024.3394733
Hong Ding, Haimin Zhang, Gang Fu, Caoqing Jiang, Fei Luo, Chunxia Xiao, Min Xu
Abstract: Preserving important textures of the content image and achieving prominent style transfer results remains a challenge in the field of image style transfer. This challenge arises from the entanglement between color and texture during the style transfer process. To address this challenge, we propose an end-to-end network that incorporates an adaptive weighted least squares (AWLS) filter, an iterative least squares (ILS) filter, and channel separation. Given a content image ($\mathcal{C}$) and a reference style image ($\mathcal{S}$), we begin by separating the RGB channels and utilizing the ILS filter to decompose them into structure and texture layers. We then perform style transfer on the structural layers using WCT$^{2}$ (incorporating wavelet pooling and unpooling techniques for whitening and coloring transforms) in the R, G, and B channels, respectively. We address the texture distortion caused by WCT$^{2}$ with a texture enhancing (TE) module in the structural layer. Furthermore, we propose an estimating and compensating for the structure loss (ECSL) module. In the ECSL module, with the AWLS filter and the ILS filter, we estimate the texture loss caused by TE, convert the loss of the structural layer to the loss of the texture layer, and compensate for the loss in the texture layer. The final structural layer and texture layer are merged into per-channel style transfer results, and the results of the separated R, G, and B channels are combined into the final style transfer result. This enables more complete texture preservation and a significant style transfer effect. To evaluate our method, we conduct quantitative experiments using various metrics, including NIQE, AG, SSIM, and PSNR, as well as a user study. The experimental results demonstrate the superiority of our approach over the previous state-of-the-art methods.
Vol. 26, pp. 9892-9905
Citations: 0