Latest Articles: ACM Transactions on Multimedia Computing, Communications and Applications

A Quality-Aware and Obfuscation-Based Data Collection Scheme for Cyber-Physical Metaverse Systems
IF 5.1 | CAS Q3 | Computer Science
Jianheng Tang, Kejia Fan, Wenjie Yin, Shihao Yang, Yajiang Huang, Anfeng Liu, Neal N. Xiong, Mianxiong Dong, Tian Wang, Shaobo Zhang
{"title":"A Quality-Aware and Obfuscation-Based Data Collection Scheme for Cyber-Physical Metaverse Systems","authors":"Jianheng Tang, Kejia Fan, Wenjie Yin, Shihao Yang, Yajiang Huang, Anfeng Liu, Neal N. Xiong, Mianxiong Dong, Tian Wang, Shaobo Zhang","doi":"10.1145/3659582","DOIUrl":"https://doi.org/10.1145/3659582","url":null,"abstract":"<p>In pursuit of an immersive virtual experience within the Cyber-Physical Metaverse Systems (CPMS), the construction of Avatars often requires a significant amount of real-world data. Mobile Crowd Sensing (MCS) has emerged as an efficient method for collecting data for CPMS. While progress has been made in protecting the privacy of workers, little attention has been given to safeguarding task privacy, potentially exposing the intentions of applications and posing risks to the development of the Metaverse. Additionally, existing privacy protection schemes hinder the exchange of information among entities, inadvertently compromising the quality of the collected data. To this end, we propose a Quality-aware and Obfuscation-based Task Privacy-Preserving (QOTPP) scheme, which protects task privacy and enhances data quality without third-party involvement. The QOTPP scheme initially employs the insight of “showing the fake, and hiding the real” by employing differential privacy techniques to create fake tasks and conceal genuine ones. Additionally, we introduce a two-tier truth discovery mechanism using Deep Matrix Factorization (DMF) to efficiently identify high-quality workers. Furthermore, we propose a Combinatorial Multi-Armed Bandit (CMAB)-based worker incentive and selection mechanism to improve the quality of data collection. Theoretical analysis confirms that our QOTPP scheme satisfies essential properties such as truthfulness, individual rationality, and ϵ-differential privacy. Extensive simulation experiments validate the state-of-the-art performance achieved by QOTPP.</p>","PeriodicalId":50937,"journal":{"name":"ACM Transactions on Multimedia Computing Communications and Applications","volume":null,"pages":null},"PeriodicalIF":5.1,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140590527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
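The abstract mentions a CMAB-based worker incentive and selection mechanism but gives no algorithmic detail. As a rough illustration of how a combinatorial upper-confidence-bound (CUCB) policy can pick a small set of high-quality workers from noisy feedback, here is a minimal Python sketch; the Bernoulli quality model, exploration bonus, and all names are assumptions for illustration, not the QOTPP formulation.

```python
import numpy as np

def cucb_select_workers(true_quality, n_workers=20, k=5, rounds=500, seed=0):
    """Toy CUCB loop: each round pick the k workers with the highest UCB index,
    observe noisy Bernoulli quality feedback, and update empirical means."""
    rng = np.random.default_rng(seed)
    counts = np.zeros(n_workers)   # times each worker was selected
    means = np.zeros(n_workers)    # empirical quality estimates
    for t in range(1, rounds + 1):
        # UCB index: empirical mean + exploration bonus (never-selected workers first)
        bonus = np.sqrt(3 * np.log(t) / np.maximum(counts, 1))
        ucb = np.where(counts == 0, np.inf, means + bonus)
        chosen = np.argsort(-ucb)[:k]                      # top-k arms (workers)
        rewards = rng.binomial(1, true_quality[chosen])    # observed data quality
        counts[chosen] += 1
        means[chosen] += (rewards - means[chosen]) / counts[chosen]
    return np.argsort(-means)[:k]

if __name__ == "__main__":
    quality = np.random.default_rng(1).uniform(0.2, 0.9, 20)
    print("selected workers:  ", cucb_select_workers(quality))
    print("truly best workers:", np.argsort(-quality)[:5])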
Rank-based Hashing for Effective and Efficient Nearest Neighbor Search for Image Retrieval
IF 5.1 | CAS Q3 | Computer Science
Vinicius Sato Kawai, Lucas Pascotti Valem, Alexandro Baldassin, Edson Borin, Daniel Carlos Guimarães Pedronette, Longin Jan Latecki
{"title":"Rank-based Hashing for Effective and Efficient Nearest Neighbor Search for Image Retrieval","authors":"Vinicius Sato Kawai, Lucas Pascotti Valem, Alexandro Baldassin, Edson Borin, Daniel Carlos Guimarães Pedronette, Longin Jan Latecki","doi":"10.1145/3659580","DOIUrl":"https://doi.org/10.1145/3659580","url":null,"abstract":"<p>The large and growing amount of digital data creates a pressing need for approaches capable of indexing and retrieving multimedia content. A traditional and fundamental challenge consists of effectively and efficiently performing nearest-neighbor searches. After decades of research, several different methods are available, including trees, hashing, and graph-based approaches. Most of the current methods exploit learning to hash approaches based on deep learning. In spite of effective results and compact codes obtained, such methods often require a significant amount of labeled data for training. Unsupervised approaches also rely on expensive training procedures usually based on a huge amount of data. In this work, we propose an unsupervised data-independent approach for nearest neighbor searches, which can be used with different features, including deep features trained by transfer learning. The method uses a rank-based formulation and exploits a hashing approach for efficient ranked list computation at query time. A comprehensive experimental evaluation was conducted on 7 public datasets, considering deep features based on CNNs and Transformers. Both effectiveness and efficiency aspects were evaluated. The proposed approach achieves remarkable results in comparison to traditional and state-of-the-art methods. Hence, it is an attractive and innovative solution, especially when costly training procedures need to be avoided.</p>","PeriodicalId":50937,"journal":{"name":"ACM Transactions on Multimedia Computing Communications and Applications","volume":null,"pages":null},"PeriodicalIF":5.1,"publicationDate":"2024-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140590537","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
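The paper's rank-based hashing formulation is not spelled out in the abstract. For orientation, the following sketch shows a generic, data-independent baseline in the same spirit: sign-random-projection (LSH-style) binary codes computed from off-the-shelf features, with a ranked list produced at query time by Hamming distance. The class name, shapes, and bit budget are assumptions for illustration, not the paper's method.

```python
import numpy as np

class SignRandomProjectionIndex:
    """Data-independent hashing: project features onto random hyperplanes and keep
    the sign pattern as a short binary code; queries rank the database by Hamming
    distance to the query code."""
    def __init__(self, dim, n_bits=64, seed=0):
        self.planes = np.random.default_rng(seed).standard_normal((dim, n_bits))

    def encode(self, feats):
        return (feats @ self.planes > 0).astype(np.uint8)   # (n, n_bits) binary codes

    def build(self, feats):
        self.codes = self.encode(feats)

    def search(self, query, top_k=10):
        q = self.encode(query[None, :])[0]
        hamming = (self.codes != q).sum(axis=1)              # distance to every item
        return np.argsort(hamming)[:top_k]                   # ranked list of neighbors

if __name__ == "__main__":
    feats = np.random.default_rng(1).standard_normal((1000, 512))  # e.g. CNN descriptors
    index = SignRandomProjectionIndex(dim=512)
    index.build(feats)
    print(index.search(feats[0]))   # item 0 should rank itself first
```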
Leveraging Frame- and Feature-Level Progressive Augmentation for Semi-supervised Action Recognition
IF 5.1 | CAS Q3 | Computer Science
Zhewei Tu, Xiangbo Shu, Peng Huang, Rui Yan, Zhenxing Liu, Jiachao Zhang
{"title":"Leveraging Frame- and Feature-Level Progressive Augmentation for Semi-supervised Action Recognition","authors":"Zhewei Tu, Xiangbo Shu, Peng Huang, Rui Yan, Zhenxing Liu, Jiachao Zhang","doi":"10.1145/3655025","DOIUrl":"https://doi.org/10.1145/3655025","url":null,"abstract":"<p>Semi-supervised action recognition is a challenging yet prospective task due to its low reliance on costly labeled videos. One high-profile solution is to explore frame-level weak/strong augmentations for learning abundant representations, inspired by the FixMatch framework dominating the semi-supervised image classification task. However, such a solution mainly brings perturbations in terms of texture and scale, leading to the limitation in learning action representations in videos with spatiotemporal redundancy and complexity. Therefore, we revisit the creative trick of weak/strong augmentations in FixMatch, and then propose a novel Frame- and Feature-level augmentation FixMatch (dubbed as F<sup>2</sup>-FixMatch) framework to learn more abundant action representations for being robust to complex and dynamic video scenarios. Specifically, we design a new Progressive Augmentation (P-Aug) mechanism that implements the weak/strong augmentations first at the frame level, and further implements the perturbation at the feature level, to obtain abundant four types of augmented features in broader perturbation spaces. Moreover, we present an evolved Multihead Pseudo-Labeling (MPL) scheme to promote the consistency of features across different augmented versions based on the pseudo labels. We conduct extensive experiments on several public datasets to demonstrate that our F<sup>2</sup>-FixMatch achieves the performance gain compared with current state-of-the-art methods. The source codes of F<sup>2</sup>-FixMatch are publicly available at https://github.com/zwtu/F2FixMatch.</p>","PeriodicalId":50937,"journal":{"name":"ACM Transactions on Multimedia Computing Communications and Applications","volume":null,"pages":null},"PeriodicalIF":5.1,"publicationDate":"2024-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140602711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
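The frame-level weak/strong augmentation idea inherited from FixMatch can be summarized with a short consistency-loss sketch (assuming PyTorch and clips that have already been weakly/strongly augmented). This is only the vanilla FixMatch unlabeled loss; the paper's feature-level perturbation and multihead pseudo-labeling are not reproduced here, and the toy model and shapes are made up.

```python
import torch
import torch.nn.functional as F

def fixmatch_unlabeled_loss(model, weak_clips, strong_clips, threshold=0.95):
    """FixMatch-style consistency loss for unlabeled video clips: pseudo-label the
    weakly augmented view, then train the strongly augmented view against it,
    keeping only confident pseudo-labels."""
    with torch.no_grad():
        probs = F.softmax(model(weak_clips), dim=1)   # predictions on the weak view
        conf, pseudo = probs.max(dim=1)               # confidence + pseudo label
        mask = (conf >= threshold).float()            # keep confident samples only
    logits_strong = model(strong_clips)               # predictions on the strong view
    loss = F.cross_entropy(logits_strong, pseudo, reduction="none")
    return (loss * mask).mean()

if __name__ == "__main__":
    # Toy "backbone": flatten a clip of 8 RGB frames and classify into 10 actions.
    model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 8 * 32 * 32, 10))
    weak = torch.randn(4, 3, 8, 32, 32)     # (batch, channels, frames, H, W)
    strong = torch.randn(4, 3, 8, 32, 32)
    print(fixmatch_unlabeled_loss(model, weak, strong).item())
```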
DBGAN: Dual Branch Generative Adversarial Network for Multi-modal MRI Translation
IF 5.1 | CAS Q3 | Computer Science
Jun Lyu, Shouang Yan, M. Shamim Hossain
{"title":"DBGAN: Dual Branch Generative Adversarial Network for Multi-modal MRI Translation","authors":"Jun Lyu, Shouang Yan, M. Shamim Hossain","doi":"10.1145/3657298","DOIUrl":"https://doi.org/10.1145/3657298","url":null,"abstract":"<p>Existing Magnetic resonance imaging (MRI) translation models rely on Generative Adversarial Networks, primarily employing simple convolutional neural networks. Unfortunately, these networks struggle to capture global representations and contextual relationships within MRI images. While the advent of Transformers enables capturing long-range feature dependencies, they often compromise the preservation of local feature details. To address these limitations and enhance both local and global representations, we introduce a novel Dual-Branch Generative Adversarial Network (DBGAN). In this framework, the Transformer branch comprises sparse attention blocks and dense self-attention blocks, allowing for a wider receptive field while simultaneously capturing local and global information. The CNN branch, built with integrated residual convolutional layers, enhances local modeling capabilities. Additionally, we propose a fusion module that cleverly integrates features extracted from both branches. Extensive experimentation on two public datasets and one clinical dataset validates significant performance improvements with DBGAN. On Brats2018, it achieves a 10%improvement in MAE, 3.2% in PSNR, and 4.8% in SSIM for image generation tasks compablack to RegGAN. Notably, the generated MRIs receive positive feedback from radiologists, underscoring the potential of our proposed method as a valuable tool in clinical settings.</p>","PeriodicalId":50937,"journal":{"name":"ACM Transactions on Multimedia Computing Communications and Applications","volume":null,"pages":null},"PeriodicalIF":5.1,"publicationDate":"2024-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140590526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
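The abstract does not describe the internals of the fusion module. Below is a minimal sketch of one plausible design: concatenating same-resolution feature maps from the CNN branch (local detail) and the Transformer branch (global context) and mixing them with a 1×1 convolution. PyTorch is assumed and the channel sizes are invented; this is not DBGAN's actual module.

```python
import torch
import torch.nn as nn

class DualBranchFusion(nn.Module):
    """Toy fusion block: concatenate CNN-branch and Transformer-branch feature maps,
    then mix them with a 1x1 convolution."""
    def __init__(self, cnn_channels, trans_channels, out_channels):
        super().__init__()
        self.mix = nn.Sequential(
            nn.Conv2d(cnn_channels + trans_channels, out_channels, kernel_size=1),
            nn.InstanceNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, cnn_feat, trans_feat):
        # both branches are assumed to produce maps of the same spatial size
        return self.mix(torch.cat([cnn_feat, trans_feat], dim=1))

if __name__ == "__main__":
    fuse = DualBranchFusion(cnn_channels=64, trans_channels=64, out_channels=64)
    out = fuse(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
    print(out.shape)   # torch.Size([1, 64, 32, 32])
```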
GANs in the Panorama of Synthetic Data Generation Methods: Application and Evaluation: Enhancing Fake News Detection with GAN-Generated Synthetic Data
IF 5.1 | CAS Q3 | Computer Science
Bruno Vaz, Álvaro Figueira
{"title":"GANs in the Panorama of Synthetic Data Generation Methods: Application and Evaluation: Enhancing Fake News Detection with GAN-Generated Synthetic Data: ACM Transactions on Multimedia Computing, Communications, and Applications: Vol 0, No ja","authors":"Bruno Vaz, Álvaro Figueira","doi":"10.1145/3657294","DOIUrl":"https://doi.org/10.1145/3657294","url":null,"abstract":"<p>This paper focuses on the creation and evaluation of synthetic data to address the challenges of imbalanced datasets in machine learning applications (ML), using fake news detection as a case study. We conducted a thorough literature review on generative adversarial networks (GANs) for tabular data, synthetic data generation methods, and synthetic data quality assessment. By augmenting a public news dataset with synthetic data generated by different GAN architectures, we demonstrate the potential of synthetic data to improve ML models’ performance in fake news detection. Our results show a significant improvement in classification performance, especially in the underrepresented class. We also modify and extend a data usage approach to evaluate the quality of synthetic data and investigate the relationship between synthetic data quality and data augmentation performance in classification tasks. We found a positive correlation between synthetic data quality and performance in the underrepresented class, highlighting the importance of high-quality synthetic data for effective data augmentation.</p>","PeriodicalId":50937,"journal":{"name":"ACM Transactions on Multimedia Computing Communications and Applications","volume":null,"pages":null},"PeriodicalIF":5.1,"publicationDate":"2024-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140590525","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
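The augmentation workflow described above (adding GAN-generated rows for the underrepresented class, then measuring the effect on classification) can be sketched as follows. The generator here is a stand-in placeholder that only jitters real minority rows; a real pipeline would sample from a trained tabular GAN, and the toy data and scores carry no claim about the paper's results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

def sample_synthetic_minority(real_minority, n, rng):
    """Stand-in for a trained tabular GAN generator; here we simply jitter real
    minority-class rows to keep the sketch self-contained."""
    idx = rng.integers(0, real_minority.shape[0], n)
    return real_minority[idx] + 0.05 * rng.standard_normal((n, real_minority.shape[1]))

rng = np.random.default_rng(0)
# Imbalanced toy dataset: 950 majority ("real news") rows vs. 50 minority ("fake news") rows.
X = np.vstack([rng.standard_normal((950, 20)), rng.standard_normal((50, 20)) + 1.5])
y = np.array([0] * 950 + [1] * 50)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# Augment only the training split with synthetic minority rows.
X_syn = sample_synthetic_minority(X_tr[y_tr == 1], 400, rng)
X_aug, y_aug = np.vstack([X_tr, X_syn]), np.concatenate([y_tr, np.ones(400, dtype=int)])

for name, (Xt, yt) in {"baseline": (X_tr, y_tr), "GAN-augmented": (X_aug, y_aug)}.items():
    clf = RandomForestClassifier(random_state=0).fit(Xt, yt)
    print(name, "minority-class F1:", round(f1_score(y_te, clf.predict(X_te)), 3))
```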
SigFormer: Sparse Signal-Guided Transformer for Multi-Modal Action Segmentation
IF 5.1 | CAS Q3 | Computer Science
Qi Liu, Xinchen Liu, Kun Liu, Xiaoyan Gu, Wu Liu
{"title":"SigFormer: Sparse Signal-Guided Transformer for Multi-Modal Action Segmentation","authors":"Qi Liu, Xinchen Liu, Kun Liu, Xiaoyan Gu, Wu Liu","doi":"10.1145/3657296","DOIUrl":"https://doi.org/10.1145/3657296","url":null,"abstract":"<p>Multi-modal human action segmentation is a critical and challenging task with a wide range of applications. Nowadays, the majority of approaches concentrate on the fusion of dense signals (i.e., RGB, optical flow, and depth maps). However, the potential contributions of sparse IoT sensor signals, which can be crucial for achieving accurate recognition, have not been fully explored. To make up for this, we introduce a <b>S</b>parse s<b>i</b>gnal-<b>g</b>uided Transformer (<b>SigFormer</b>) to combine both dense and sparse signals. We employ mask attention to fuse localized features by constraining cross-attention within the regions where sparse signals are valid. However, since sparse signals are discrete, they lack sufficient information about the temporal action boundaries. Therefore, in SigFormer, we propose to emphasize the boundary information at two stages to alleviate this problem. In the first feature extraction stage, we introduce an intermediate bottleneck module to jointly learn both category and boundary features of each dense modality through the inner loss functions. After the fusion of dense modalities and sparse signals, we then devise a two-branch architecture that explicitly models the interrelationship between action category and temporal boundary. Experimental results demonstrate that SigFormer outperforms the state-of-the-art approaches on a multi-modal action segmentation dataset from real industrial environments, reaching an outstanding F1 score of 0.958. The codes and pre-trained models have been available at https://github.com/LIUQI-creat/SigFormer.</p>","PeriodicalId":50937,"journal":{"name":"ACM Transactions on Multimedia Computing Communications and Applications","volume":null,"pages":null},"PeriodicalIF":5.1,"publicationDate":"2024-04-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140590586","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
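The mask-attention idea, restricting cross-attention from dense-modality features to the time steps where the sparse sensor signal is valid, can be illustrated with a small PyTorch sketch. The tensor shapes and the single-head formulation are simplifying assumptions, not SigFormer's exact blocks.

```python
import torch
import torch.nn.functional as F

def masked_cross_attention(dense_q, sparse_kv, valid_mask):
    """Cross-attention from dense-modality queries to sparse-signal keys/values,
    confined to time steps where the sparse signal is valid.
    dense_q:    (B, T, D) dense features (e.g. RGB / optical flow)
    sparse_kv:  (B, T, D) sparse IoT sensor features
    valid_mask: (B, T) bool, True where the sparse signal exists."""
    d = dense_q.shape[-1]
    scores = dense_q @ sparse_kv.transpose(1, 2) / d ** 0.5          # (B, T, T)
    scores = scores.masked_fill(~valid_mask[:, None, :], float("-inf"))
    attn = F.softmax(scores, dim=-1)
    return attn @ sparse_kv                                           # fused features (B, T, D)

if __name__ == "__main__":
    B, T, D = 2, 16, 32
    mask = torch.zeros(B, T, dtype=torch.bool)
    mask[:, ::4] = True    # the sparse sensor fires only every 4th time step
    out = masked_cross_attention(torch.randn(B, T, D), torch.randn(B, T, D), mask)
    print(out.shape)
```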
Psychology-Guided Environment Aware Network for Discovering Social Interaction Groups from Videos
IF 5.1 | CAS Q3 | Computer Science
Jiaqi Yu, Jinhai Yang, Hua Yang, Renjie Pan, Pingrui Lai, Guangtao Zhai
{"title":"Psychology-Guided Environment Aware Network for Discovering Social Interaction Groups from Videos","authors":"Jiaqi Yu, Jinhai Yang, Hua Yang, Renjie Pan, Pingrui Lai, Guangtao Zhai","doi":"10.1145/3657295","DOIUrl":"https://doi.org/10.1145/3657295","url":null,"abstract":"<p>Social interaction is a common phenomenon in human societies. Different from discovering groups based on the similarity of individuals’ actions, social interaction focuses more on the mutual influence between people. Although people can easily judge whether or not there are social interactions in a real-world scene, it is difficult for an intelligent system to discover social interactions. Initiating and concluding social interactions are greatly influenced by an individual’s social cognition and the surrounding environment, which are closely related to psychology. Thus, converting the psychological factors that impact social interactions into quantifiable visual representations and creating a model for interaction relationships poses a significant challenge. To this end, we propose a Psychology-Guided Environment Aware Network (PEAN) that models social interaction among people in videos using supervised learning. Specifically, we divide the surrounding environment into scene-aware visual-based and human-aware visual-based descriptions. For the scene-aware visual clue, we utilize 3D features as global visual representations. For the human-aware visual clue, we consider instance-based location and behaviour-related visual representations to map human-centered interaction elements in social psychology: distance, openness and orientation. In addition, we design an environment aware mechanism to integrate features from visual clues, with a Transformer to explore the relation between individuals and construct pairwise interaction strength features. The interaction intensity matrix reflecting the mutual nature of the interaction is obtained by processing the interaction strength features with the interaction discovery module. An interaction constrained loss function composed of interaction critical loss function and smooth <i>F<sub>β</sub></i> loss function is proposed to optimize the whole framework to improve the distinction of the interaction matrix and alleviate class imbalance caused by pairwise interaction sparsity. Given the diversity of real-world interactions, we collect a new dataset named Social Basketball Activity Dataset (Soical-BAD), covering complex social interactions. Our method achieves the best performance among social-CAD, social-BAD, and their combined dataset named Video Social Interaction Dataset (VSID).</p>","PeriodicalId":50937,"journal":{"name":"ACM Transactions on Multimedia Computing Communications and Applications","volume":null,"pages":null},"PeriodicalIF":5.1,"publicationDate":"2024-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140590533","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
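The smooth Fβ loss mentioned above is worth unpacking: a common differentiable surrogate computes soft true/false positives and negatives from predicted probabilities over the pairwise interaction matrix. The sketch below implements one such surrogate in PyTorch; the paper's exact definition may differ, and the matrix sizes and β value are illustrative.

```python
import torch

def soft_fbeta_loss(pred_probs, target, beta=2.0, eps=1e-7):
    """Differentiable F-beta surrogate for a sparse binary interaction matrix.
    pred_probs: (N, N) predicted pairwise interaction probabilities in [0, 1]
    target:     (N, N) binary ground-truth interaction matrix."""
    tp = (pred_probs * target).sum()          # soft true positives
    fp = (pred_probs * (1 - target)).sum()    # soft false positives
    fn = ((1 - pred_probs) * target).sum()    # soft false negatives
    b2 = beta ** 2
    fbeta = (1 + b2) * tp / ((1 + b2) * tp + b2 * fn + fp + eps)
    return 1.0 - fbeta                        # minimizing the loss maximizes soft F-beta

if __name__ == "__main__":
    logits = torch.randn(8, 8, requires_grad=True)
    target = (torch.rand(8, 8) < 0.15).float()   # sparse interactions
    loss = soft_fbeta_loss(torch.sigmoid(logits), target)
    loss.backward()
    print(loss.item())
```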
RSUIGM: Realistic Synthetic Underwater Image Generation with Image Formation Model
IF 5.1 | CAS Q3 | Computer Science
Chaitra Desai, Sujay Benur, Ujwala Patil, Uma Mudenagudi
{"title":"RSUIGM: Realistic Synthetic Underwater Image Generation with Image Formation Model","authors":"Chaitra Desai, Sujay Benur, Ujwala Patil, Uma Mudenagudi","doi":"10.1145/3656473","DOIUrl":"https://doi.org/10.1145/3656473","url":null,"abstract":"<p>In this paper, we propose to synthesize realistic underwater images with a novel image formation model, considering both downwelling depth and line of sight (LOS) distance as cue and call it as Realistic Synthetic Underwater Image Generation Model, RSUIGM. The light interaction in the ocean is a complex process and demands specific modeling of direct and backscattering phenomenon to capture the degradations. Most of the image formation models rely on complex radiative transfer models and in-situ measurements for synthesizing and restoration of underwater images. Typical image formation models consider only line of sight distance <i>z</i> and ignore downwelling depth <i>d</i> in the estimation of effect of direct light scattering. We derive the dependencies of downwelling irradiance in direct light estimation for generation of synthetic underwater images unlike state-of-the-art image formation models. We propose to incorporate the derived downwelling irradiance in estimation of direct light scattering for modeling the image formation process and generate realistic synthetic underwater images with the proposed RSUIGM, and name it as <i>RSUIGM dataset</i>. We demonstrate the effectiveness of the proposed RSUIGM by using RSUIGM dataset in training deep learning based restoration methods. We compare the quality of restored images with state-of-the-art methods using benchmark real underwater image datasets and achieve improved results. In addition, we validate the distribution of realistic synthetic underwater images versus real underwater images both qualitatively and quantitatively. The proposed RSUIGM dataset is available here.\u0000</p>","PeriodicalId":50937,"journal":{"name":"ACM Transactions on Multimedia Computing Communications and Applications","volume":null,"pages":null},"PeriodicalIF":5.1,"publicationDate":"2024-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140590438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
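The full RSUIGM derivation is in the paper. As context, a simplified, textbook-style formation model that attenuates scene radiance with downwelling depth d, transmits it along LOS distance z, and adds a backscatter term can be sketched as below. The per-channel attenuation coefficients and veiling-light colour are illustrative values, not the paper's calibrated parameters.

```python
import numpy as np

def synthesize_underwater(clean, z, d,
                          beta=(0.35, 0.10, 0.05),    # per-channel LOS attenuation (R, G, B)
                          k_d=(0.30, 0.08, 0.04),     # per-channel downwelling attenuation
                          b_inf=(0.05, 0.35, 0.45)):  # veiling light colour at infinity
    """Simplified underwater image formation:
        I_c = J_c * exp(-k_d,c * d) * exp(-beta_c * z) + B_inf,c * (1 - exp(-beta_c * z))
    where z is the line-of-sight distance (m) and d is the downwelling depth (m).
    clean: float image in [0, 1], shape (H, W, 3); z: scalar or (H, W) distance map."""
    beta, k_d, b_inf = (np.asarray(v, dtype=np.float32) for v in (beta, k_d, b_inf))
    z = np.atleast_3d(np.asarray(z, dtype=np.float32))   # (H, W, 1) for broadcasting
    transmission = np.exp(-beta * z)                      # direct-light transmission along LOS
    downwelling = np.exp(-k_d * float(d))                 # surface-light loss with depth
    direct = clean * downwelling * transmission
    backscatter = b_inf * (1.0 - transmission)
    return np.clip(direct + backscatter, 0.0, 1.0)

if __name__ == "__main__":
    clean = np.random.default_rng(0).random((64, 64, 3)).astype(np.float32)
    degraded = synthesize_underwater(clean, z=5.0, d=10.0)
    print(degraded.shape, float(degraded.min()), float(degraded.max()))
```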
High Fidelity Makeup via 2D and 3D Identity Preservation Net
IF 5.1 | CAS Q3 | Computer Science
Jinliang Liu, Zhedong Zheng, Zongxin Yang, Yi Yang
{"title":"High Fidelity Makeup via 2D and 3D Identity Preservation Net","authors":"Jinliang Liu, Zhedong Zheng, Zongxin Yang, Yi Yang","doi":"10.1145/3656475","DOIUrl":"https://doi.org/10.1145/3656475","url":null,"abstract":"<p>In this paper, we address the challenging makeup transfer task, aiming to transfer makeup from a reference image to a source image while preserving facial geometry and background consistency. Existing deep neural network-based methods have shown promising results in aligning facial parts and transferring makeup textures. However, they often neglect the facial geometry of the source image, leading to two adverse effects: (1) alterations in geometrically relevant facial features, causing face flattening and loss of personality, and (2) difficulties in maintaining background consistency, as networks cannot clearly determine the face-background boundary. To jointly tackle these issues, we propose the High Fidelity Makeup via 2D and 3D Identity Preservation Network (IP23-Net), a novel framework that leverages facial geometry information to generate more realistic results. Our method comprises a 3D Shape Identity Encoder, which extracts identity and 3D shape features. We incorporate a 3D face reconstruction model to ensure the three-dimensional effect of face makeup, thereby preserving the characters’ depth and natural appearance. To preserve background consistency, our Background Correction Decoder automatically predicts an adaptive mask for the source image, distinguishing the foreground and background. In addition to popular benchmarks, we introduce a new large-scale High Resolution Synthetic Makeup Dataset containing 335,230 diverse high-resolution face images, to evaluate our method’s generalization ability. Experiments demonstrate that IP23-Net achieves high-fidelity makeup transfer while effectively preserving background consistency. The code will be made publicly available.</p>","PeriodicalId":50937,"journal":{"name":"ACM Transactions on Multimedia Computing Communications and Applications","volume":null,"pages":null},"PeriodicalIF":5.1,"publicationDate":"2024-04-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140603157","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
A Self-Defense Copyright Protection Scheme for NFT Image Art Based on Information Embedding
IF 5.1 | CAS Q3 | Computer Science
Fan Wang, Zhangjie Fu, Xiang Zhang
{"title":"A Self-Defense Copyright Protection Scheme for NFT Image Art Based on Information Embedding","authors":"Fan Wang, Zhangjie Fu, Xiang Zhang","doi":"10.1145/3652519","DOIUrl":"https://doi.org/10.1145/3652519","url":null,"abstract":"<p>Non-convertible tokens (NFTs) have become a fundamental part of the metaverse ecosystem due to its uniqueness and immutability. However, existing copyright protection schemes of NFT image art relied on the NFTs itself minted by third-party platforms. A minted NFT image art only tracks and verifies the entire transaction process, but the legitimacy of the source and ownership of its mapped digital image art cannot be determined. The original author or authorized publisher lack an active defense mechanism to prove ownership of the digital image art mapped by the unauthorized NFT. Therefore, we propose a self-defense copyright protection scheme for NFT image art based on information embedding in this paper, called SDCP-IE. The original author or authorized publisher can embed the copyright information into the published digital image art without damaging its visual effect in advance. Different from the existing information embedding works, the proposed SDCP-IE can generally enhance the invisibility of copyright information with different embedding capacity. Furthermore, considering the scenario of copyright information being discovered or even destroyed by unauthorized parties, the designed SDCP-IE can efficiently generate enhanced digital image art to improve the security performance of embedded image, thus resisting the detection of multiple known and unknown detection models simultaneously. The experimental results have also shown that the PSNR values of enhanced embedded image are all over 57db on three datasets BOSSBase, BOWS2 and ALASKA#2. Moreover, compared with existing information embedding works, the enhanced embedded images generated by SDCP-IE reaches the best transferability performance on the advanced CNN-based detection models. When the target detector is the pre-trained SRNet at 0.4bpp, the test error rate of SDCP-IE at 0.4bpp on the evaluated detection model YeNet reaches 53.38%, which is 4.92%, 28.62% and 7.05% higher than that of the UTGAN, SPS-ENH and Xie-Model, respectively.</p>","PeriodicalId":50937,"journal":{"name":"ACM Transactions on Multimedia Computing Communications and Applications","volume":null,"pages":null},"PeriodicalIF":5.1,"publicationDate":"2024-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140575166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
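SDCP-IE embeds copyright information adaptively and adds an enhancement stage to resist steganalysis; none of that is reproduced here. Purely to illustrate what "information embedding" means at the pixel level, the following toy least-significant-bit (LSB) sketch hides and recovers a copyright string; it is not the paper's scheme and would not survive the detectors discussed above.

```python
import numpy as np

def embed_bits(image, message: bytes):
    """Toy LSB embedding: write the message bits into the least-significant bit
    of the flattened pixel array (capacity permitting)."""
    bits = np.unpackbits(np.frombuffer(message, dtype=np.uint8))
    flat = image.flatten()
    assert bits.size <= flat.size, "message too long for this cover image"
    flat[:bits.size] = (flat[:bits.size] & 0xFE) | bits
    return flat.reshape(image.shape)

def extract_bits(stego, n_bytes):
    """Read back the first n_bytes worth of LSBs."""
    bits = stego.flatten()[: n_bytes * 8] & 1
    return np.packbits(bits).tobytes()

if __name__ == "__main__":
    cover = np.random.default_rng(0).integers(0, 256, (64, 64), dtype=np.uint8)
    msg = b"(c) 2024 original author"
    stego = embed_bits(cover, msg)
    print(extract_bits(stego, len(msg)))                          # recovered copyright string
    print(np.abs(stego.astype(int) - cover.astype(int)).max())    # distortion: at most 1 per pixel
```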