从屏障功能看 "会说话的头 "的皮肤弹性

IF 4.3 3区 计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Iti Chaturvedi, Vlad Pandelea, Erik Cambria, Roy Welsch, Bithin Datta
{"title":"从屏障功能看 \"会说话的头 \"的皮肤弹性","authors":"Iti Chaturvedi, Vlad Pandelea, Erik Cambria, Roy Welsch, Bithin Datta","doi":"10.1007/s12559-024-10344-7","DOIUrl":null,"url":null,"abstract":"<p>In this paper, we target the problem of generating facial expressions from a piece of audio. This is challenging since both audio and video have inherent characteristics that are distinct from the other. Some words may have identical lip movements, and speech impediments may prevent lip-reading in some individuals. Previous approaches to generating such a talking head suffered from stiff expressions. This is because they focused only on lip movements and the facial landmarks did not contain the information flow from the audio. Hence, in this work, we employ spatio-temporal independent component analysis to accurately sync the audio with the corresponding face video. Proper word formation also requires control over the face muscles that can be captured using a barrier function. We first validated the approach on the diffusion of salt water in coastal areas using a synthetic finite element simulation. Next, we applied it to 3D facial expressions in toddlers for which training data is difficult to capture. Prior knowledge in the form of rules is specified using Fuzzy logic, and multi-objective optimization is used to collectively learn a set of rules. We observed significantly higher F-measure on three real-world problems.</p>","PeriodicalId":51243,"journal":{"name":"Cognitive Computation","volume":null,"pages":null},"PeriodicalIF":4.3000,"publicationDate":"2024-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Barrier Function to Skin Elasticity in Talking Head\",\"authors\":\"Iti Chaturvedi, Vlad Pandelea, Erik Cambria, Roy Welsch, Bithin Datta\",\"doi\":\"10.1007/s12559-024-10344-7\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>In this paper, we target the problem of generating facial expressions from a piece of audio. This is challenging since both audio and video have inherent characteristics that are distinct from the other. Some words may have identical lip movements, and speech impediments may prevent lip-reading in some individuals. Previous approaches to generating such a talking head suffered from stiff expressions. This is because they focused only on lip movements and the facial landmarks did not contain the information flow from the audio. Hence, in this work, we employ spatio-temporal independent component analysis to accurately sync the audio with the corresponding face video. Proper word formation also requires control over the face muscles that can be captured using a barrier function. We first validated the approach on the diffusion of salt water in coastal areas using a synthetic finite element simulation. Next, we applied it to 3D facial expressions in toddlers for which training data is difficult to capture. Prior knowledge in the form of rules is specified using Fuzzy logic, and multi-objective optimization is used to collectively learn a set of rules. We observed significantly higher F-measure on three real-world problems.</p>\",\"PeriodicalId\":51243,\"journal\":{\"name\":\"Cognitive Computation\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":4.3000,\"publicationDate\":\"2024-08-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Cognitive Computation\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s12559-024-10344-7\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Cognitive Computation","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s12559-024-10344-7","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

摘要

在本文中,我们的目标是解决从一段音频生成面部表情的问题。这是一个具有挑战性的问题,因为音频和视频都具有不同的固有特征。有些单词的唇部动作可能完全相同,有些人可能因为语言障碍而无法读唇。以前生成这种 "话头 "的方法存在表情僵硬的问题。这是因为它们只关注唇部动作,而面部地标并不包含音频信息流。因此,在这项工作中,我们采用了时空独立分量分析法,以准确同步音频和相应的面部视频。正确的构词还需要对面部肌肉的控制,这可以通过障碍函数来捕捉。我们首先利用合成有限元模拟对沿海地区的盐水扩散进行了验证。接着,我们将其应用于难以获取训练数据的幼儿三维面部表情。先验知识以规则的形式使用模糊逻辑进行指定,并使用多目标优化来共同学习一组规则。我们在三个实际问题上观察到了明显更高的 F-measure。
本文章由计算机程序翻译,如有差异,请以英文原文为准。

Barrier Function to Skin Elasticity in Talking Head

Barrier Function to Skin Elasticity in Talking Head

In this paper, we target the problem of generating facial expressions from a piece of audio. This is challenging since both audio and video have inherent characteristics that are distinct from the other. Some words may have identical lip movements, and speech impediments may prevent lip-reading in some individuals. Previous approaches to generating such a talking head suffered from stiff expressions. This is because they focused only on lip movements and the facial landmarks did not contain the information flow from the audio. Hence, in this work, we employ spatio-temporal independent component analysis to accurately sync the audio with the corresponding face video. Proper word formation also requires control over the face muscles that can be captured using a barrier function. We first validated the approach on the diffusion of salt water in coastal areas using a synthetic finite element simulation. Next, we applied it to 3D facial expressions in toddlers for which training data is difficult to capture. Prior knowledge in the form of rules is specified using Fuzzy logic, and multi-objective optimization is used to collectively learn a set of rules. We observed significantly higher F-measure on three real-world problems.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Cognitive Computation
Cognitive Computation COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE-NEUROSCIENCES
CiteScore
9.30
自引率
3.70%
发文量
116
审稿时长
>12 weeks
期刊介绍: Cognitive Computation is an international, peer-reviewed, interdisciplinary journal that publishes cutting-edge articles describing original basic and applied work involving biologically-inspired computational accounts of all aspects of natural and artificial cognitive systems. It provides a new platform for the dissemination of research, current practices and future trends in the emerging discipline of cognitive computation that bridges the gap between life sciences, social sciences, engineering, physical and mathematical sciences, and humanities.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信