2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)最新文献

筛选
英文 中文
Expressive Speech-Driven Facial Animation with Controllable Emotions 具有可控情绪的表情语言驱动的面部动画
2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2023-01-05 DOI: 10.1109/ICMEW59549.2023.00073
Yutong Chen, Junhong Zhao, Weiqiang Zhang
{"title":"Expressive Speech-Driven Facial Animation with Controllable Emotions","authors":"Yutong Chen, Junhong Zhao, Weiqiang Zhang","doi":"10.1109/ICMEW59549.2023.00073","DOIUrl":"https://doi.org/10.1109/ICMEW59549.2023.00073","url":null,"abstract":"It is in high demand to generate facial animation with high realism, but it remains a challenging task. Existing approaches of speech-driven facial animation can produce satisfactory mouth movement and lip synchronization, but show weakness in dramatic emotional expressions and flexibility in emotion control. This paper presents a novel deep learning-based approach for expressive facial animation generation from speech that can exhibit wide-spectrum facial expressions with controllable emotion type and intensity. We propose an emotion controller module to learn the relationship between the emotion variations (e.g., types and intensity) and the corresponding facial expression parameters. It enables emotion-controllable facial animation, where the target expression can be continuously adjusted as desired. The qualitative and quantitative evaluations show that the animation generated by our method is rich in facial emotional expressiveness while retaining accurate lip movement, outperforming other state-of-the-art methods.","PeriodicalId":111482,"journal":{"name":"2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123727257","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
Skeletonmae: Spatial-Temporal Masked Autoencoders for Self-Supervised Skeleton Action Recognition 用于自监督骨骼动作识别的时空掩码自编码器
2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2022-09-01 DOI: 10.1109/ICMEW59549.2023.00045
Wenhan Wu, Yilei Hua, Ce Zheng, Shi-Bao Wu, Chen Chen, Aidong Lu
{"title":"Skeletonmae: Spatial-Temporal Masked Autoencoders for Self-Supervised Skeleton Action Recognition","authors":"Wenhan Wu, Yilei Hua, Ce Zheng, Shi-Bao Wu, Chen Chen, Aidong Lu","doi":"10.1109/ICMEW59549.2023.00045","DOIUrl":"https://doi.org/10.1109/ICMEW59549.2023.00045","url":null,"abstract":"Self-supervised skeleton-based action recognition has attracted more attention in recent years. By utilizing the unlabeled data, more generalizable features can be learned to alleviate the overfitting problem and reduce the demand for massive labeled training data. Inspired by the MAE [1], we propose a spatial-temporal masked autoencoder framework for self-supervised 3D skeleton-based action recognition (SkeletonMAE). Following MAE's masking and reconstruction pipeline, we utilize a skeleton-based encoder-decoder transformer architecture to reconstruct the masked skeleton sequences. A novel masking strategy, named Spatial-Temporal Masking, is introduced in terms of both joint-level and frame-level for the skeleton sequence. This pre-training strategy makes the encoder output generalizable skeleton features with spatial and temporal dependencies. Given the unmasked skeleton sequence, the encoder is fine-tuned for the action recognition task. Extensive experiments show that our SkeletonMAE achieves remarkable performance and outperforms the state-of-the-art methods on both NTU RGB+D 60 and NTU RGB+D 120 datasets.","PeriodicalId":111482,"journal":{"name":"2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"242 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122523007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
Aesthetic Visual Question Answering of Photographs 摄影美学视觉问答
2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW) Pub Date : 2022-08-10 DOI: 10.1109/ICMEW59549.2023.00068
Xin Jin, Wu Zhou, Xinghui Zhou, Shuai Cui, Le Zhang, Jianwen Lv, Shu Zhao
{"title":"Aesthetic Visual Question Answering of Photographs","authors":"Xin Jin, Wu Zhou, Xinghui Zhou, Shuai Cui, Le Zhang, Jianwen Lv, Shu Zhao","doi":"10.1109/ICMEW59549.2023.00068","DOIUrl":"https://doi.org/10.1109/ICMEW59549.2023.00068","url":null,"abstract":"Aesthetic assessment of images can be categorized into two main forms: numerical assessment and language assessment. In this paper, we propose a new task of aesthetic language assessment: aesthetic visual question and answering (AVQA) of images. We use images from www.flickr.com. The objective QA pairs are generated by the proposed aesthetic attributes analysis algorithms. Moreover, we introduce subjective QA pairs that are converted from aesthetic numerical labels and sentiment analysis from large-scale pre-train models. We build the first aesthetic visual question answering dataset, AesVQA, that contains 72,168 high-quality images and 324,756 pairs of aesthetic questions. This is the first work that both addresses the task of aesthetic VQA and introduces subjectiveness into VQA tasks. The experimental results reveal that our methods outperform other VQA models on this new task.","PeriodicalId":111482,"journal":{"name":"2023 IEEE International Conference on Multimedia and Expo Workshops (ICMEW)","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133184370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信