ACM Multimedia Asia最新文献_第6页

Latent Pattern Sensing: Deepfake Video Detection via Predictive Representation Learning 潜在模式感知:基于预测表示学习的深度假视频检测

ACM Multimedia Asia Pub Date : 2021-12-01 DOI: 10.1145/3469877.3490586

Shiming Ge, Fanzhao Lin, Chenyu Li, Daichi Zhang, Jiyong Tan, Weiping Wang, Dan Zeng

{"title":"Latent Pattern Sensing: Deepfake Video Detection via Predictive Representation Learning","authors":"Shiming Ge, Fanzhao Lin, Chenyu Li, Daichi Zhang, Jiyong Tan, Weiping Wang, Dan Zeng","doi":"10.1145/3469877.3490586","DOIUrl":"https://doi.org/10.1145/3469877.3490586","url":null,"abstract":"Increasingly advanced deepfake approaches have made the detection of deepfake videos very challenging. We observe that the general deepfake videos often exhibit appearance-level temporal inconsistencies in some facial components between frames, resulting in discriminable spatiotemporal latent patterns among semantic-level feature maps. Inspired by this finding, we propose a predictive representative learning approach termed Latent Pattern Sensing to capture these semantic change characteristics for deepfake video detection. The approach cascades a CNN-based encoder, a ConvGRU-based aggregator and a single-layer binary classifier. The encoder and aggregator are pre-trained in a self-supervised manner to form the representative spatiotemporal context features. Finally, the classifier is trained to classify the context features, distinguishing fake videos from real ones. In this manner, the extracted features can simultaneously describe the latent patterns of videos across frames spatially and temporally in a unified way, leading to an effective deepfake video detector. Extensive experiments prove our approach’s effectiveness, e.g., surpassing 10 state-of-the-arts at least 7.92%@AUC on challenging Celeb-DF(v2) benchmark.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125289993","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 5

Dedark+Detection: A Hybrid Scheme for Object Detection under Low-light Surveillance Dedark+Detection:一种低光监视下的混合目标检测方案

ACM Multimedia Asia Pub Date : 2021-12-01 DOI: 10.1145/3469877.3497691

Xiaolei Luo, S. Xiang, Yingfeng Wang, Qiong Liu, You Yang, Kejun Wu

引用次数: 0

Zero-shot Recognition with Image Attributes Generation using Hierarchical Coupled Dictionary Learning 使用层次耦合字典学习生成图像属性的零射击识别

ACM Multimedia Asia Pub Date : 2021-12-01 DOI: 10.1145/3469877.3490613

Shuang Li, Lichun Wang, Shaofan Wang, Dehui Kong, Baocai Yin

{"title":"Zero-shot Recognition with Image Attributes Generation using Hierarchical Coupled Dictionary Learning","authors":"Shuang Li, Lichun Wang, Shaofan Wang, Dehui Kong, Baocai Yin","doi":"10.1145/3469877.3490613","DOIUrl":"https://doi.org/10.1145/3469877.3490613","url":null,"abstract":"Zero-shot learning (ZSL) aims to recognize images from unseen (novel) classes with the training images from seen classes. The attributes of each class is exploited as auxiliary semantic information. Recently most ZSL approaches focus on learning visual-semantic embeddings to transfer knowledge from the seen classes to the unseen classes. However, few works study whether the auxiliary semantic information in the class-level is extensive enough or not for the ZSL task. To tackle such problem, we propose a hierarchical coupled dictionary learning (HCDL) approach to hierarchically align the visual-semantic structures in both the class-level and the image-level. Firstly, the class-level coupled dictionary is trained to establish a basic connection between visual space and semantic space. Then, the image attributes are generated based on the basic connection. Finally, the fine-grained information can be embedded by training the image-level coupled dictionary. Zero-shot recognition is performed in multiple spaces by searching the nearest neighbor class of the unseen image. Experiments on two widely used benchmark datasets show the effectiveness of the proposed approach.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133254279","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 3

Prediction of Transcription Factor Binding Sites Using Deep Learning Combined with DNA Sequences and Shape Feature Data 结合DNA序列和形状特征数据的深度学习预测转录因子结合位点

ACM Multimedia Asia Pub Date : 2021-12-01 DOI: 10.1145/3469877.3497696

Yangyang Li, Jie Liu, Hao Liu

引用次数: 0

Delay-sensitive and Priority-aware Transmission Control for Real-time Multimedia Communications 实时多媒体通信的延迟敏感和优先级感知传输控制

ACM Multimedia Asia Pub Date : 2021-12-01 DOI: 10.1145/3469877.3493597

Ximing Wu, Lei Zhang, Yingfeng Wu, Haobin Zhou, Laizhong Cui

引用次数: 0

PBNet: Position-specific Text-to-image Generation by Boundary PBNet:根据边界生成特定位置的文本到图像

ACM Multimedia Asia Pub Date : 2021-12-01 DOI: 10.1145/3469877.3493594

Tian Tian, Li Liu, Huaxiang Zhang, Dongmei Liu

引用次数: 0

Inter-modality Discordance for Multimodal Fake News Detection 多模态假新闻检测中的多模态不一致

ACM Multimedia Asia Pub Date : 2021-12-01 DOI: 10.1145/3469877.3490614

Shivangi Singhal, Mudit Dhawan, R. Shah, P. Kumaraguru

{"title":"Inter-modality Discordance for Multimodal Fake News Detection","authors":"Shivangi Singhal, Mudit Dhawan, R. Shah, P. Kumaraguru","doi":"10.1145/3469877.3490614","DOIUrl":"https://doi.org/10.1145/3469877.3490614","url":null,"abstract":"The paradigm shift in the consumption of news via online platforms has cultivated the growth of digital journalism. Contrary to traditional media, lowering entry barriers and enabling everyone to be part of content creation have disabled the concept of centralized gatekeeping in digital journalism. This in turn has triggered the production of fake news. Current studies have made a significant effort towards multimodal fake news detection with less emphasis on exploring the discordance between the different multimedia present in a news article. We hypothesize that fabrication of either modality will lead to dissonance between the modalities, and resulting in misrepresented, misinterpreted and misleading news. In this paper, we inspect the authenticity of news coming from online media outlets by exploiting relationship (discordance) between the textual and multiple visual cues. We develop an inter-modality discordance based fake news detection framework to achieve the goal. The modal-specific discriminative features are learned, employing the cross-entropy loss and a modified version of contrastive loss that explores the inter-modality discordance. To the best of our knowledge, this is the first work that leverages information from different components of the news article (i.e., headline, body, and multiple images) for multimodal fake news detection. We conduct extensive experiments on the real-world datasets to show that our approach outperforms the state-of-the-art by an average F1-score of 6.3%.","PeriodicalId":210974,"journal":{"name":"ACM Multimedia Asia","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124346528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 8

BAND: A Benchmark Dataset forBangla News Audio Classification BAND:孟加拉语新闻音频分类的基准数据集

ACM Multimedia Asia Pub Date : 2021-12-01 DOI: 10.1145/3469877.3490575

Md. Rafi Ur Rashid, Mahim Mahbub, Muhammad Abdullah Adnan

引用次数: 1

Color Image Denoising via Tensor Robust PCA with Nonconvex and Nonlocal Regularization 基于非凸非局部正则化张量鲁棒PCA的彩色图像去噪

ACM Multimedia Asia Pub Date : 2021-12-01 DOI: 10.1145/3469877.3493592

Xiaoyu Geng, Q. Guo, Cai-ming Zhang

引用次数: 0

Making Video Recognition Models Robust to Common Corruptions With Supervised Contrastive Learning 利用监督对比学习使视频识别模型对常见腐败具有鲁棒性

ACM Multimedia Asia Pub Date : 2021-12-01 DOI: 10.1145/3469877.3497692

Tomu Hirata, Yusuke Mukuta, Tatsuya Harada

引用次数: 2