Detecting incongruity in the expression of emotions in short videos based on a multimodal approach

JCR quartile: Q3 (Physics and Astronomy)
Anastasia Laushkina, Ivan Smirnov, A. Medvedev, A. Laptev, Mikhail Sinko
Cybernetics and Physics (journal article), published 2022-12-30
DOI: https://doi.org/10.35470/2226-4116-2022-11-4-210-216
Citation count: 0

Abstract

People face uncertainty every day; it is an integral part of their lives. Uncertainty creates risks for companies of many kinds, and the financial sector in particular may incur losses from human error. To reduce this uncertainty, people turn to experts with specialized knowledge. It has been established that an expert proves unreliable if they use incongruent manipulation techniques. In this article we propose a method for estimating congruence. We also put forward and test the hypothesis that a person delivering a prepared speech and a person speaking spontaneously exhibit different levels of congruence. To determine congruence, our solution evaluates the similarity of the emotional states conveyed through the verbal and nonverbal channels. Convolutional neural networks (CNNs) assess a person's emotional state from video and audio, speech-to-text extracts the text of the speaker's speech, and a pre-trained BERT model then analyzes its emotional tone. Tests have shown that this approach makes it possible not only to detect a person's incongruence but also to indicate when it is artificial in origin, that is, to distinguish a merely incongruent person from a deepfake.
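The abstract describes congruence as the similarity of emotional states across channels but does not state the metric used. The sketch below is one possible reading, not the authors' implementation: it assumes each channel (video, audio, text) has already produced a score vector over a shared set of emotion labels, and it takes the mean pairwise cosine similarity as the congruence score. The label set, the function names, and the sample numbers are all hypothetical.

```python
from itertools import combinations
from math import sqrt

# Hypothetical shared label set; the source does not list the emotion classes.
EMOTIONS = ("angry", "happy", "neutral", "sad")


def cosine_similarity(p, q):
    """Cosine similarity between two equal-length score vectors."""
    if len(p) != len(q):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(p, q))
    norm_p = sqrt(sum(a * a for a in p))
    norm_q = sqrt(sum(b * b for b in q))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0  # an all-zero vector carries no emotional signal
    return dot / (norm_p * norm_q)


def congruence_score(channels):
    """Mean pairwise cosine similarity across the channels' emotion vectors.

    `channels` maps a channel name to its score vector, ordered as in EMOTIONS.
    The choice of cosine similarity is an assumption made for illustration.
    """
    names = sorted(channels)
    if len(names) < 2:
        raise ValueError("need at least two channels")
    pairs = list(combinations(names, 2))
    total = sum(cosine_similarity(channels[a], channels[b]) for a, b in pairs)
    return total / len(pairs)


# Made-up per-channel outputs standing in for the CNN and BERT predictions.
congruent = {
    "video": [0.05, 0.80, 0.10, 0.05],
    "audio": [0.10, 0.70, 0.15, 0.05],
    "text": [0.05, 0.75, 0.15, 0.05],
}
incongruent = {
    "video": [0.70, 0.05, 0.15, 0.10],
    "audio": [0.10, 0.10, 0.70, 0.10],
    "text": [0.05, 0.80, 0.10, 0.05],
}

print(round(congruence_score(congruent), 3))
print(round(congruence_score(incongruent), 3))
```

With these sample vectors the first speaker, whose channels agree on the dominant emotion, scores higher than the second, whose channels each point to a different emotion. Any threshold separating congruent from incongruent speakers, or incongruent speakers from deepfakes, would have to be chosen empirically and is not given in the abstract.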
Source journal: Cybernetics and Physics (Chemical Engineering, Fluid Flow and Transfer Processes)
CiteScore: 1.70
Self-citation rate: 0.00%
Articles published: 17
Review time: 10 weeks
Journal description: The scope of the journal includes:
- Nonlinear dynamics and control
- Complexity and self-organization
- Control of oscillations
- Control of chaos and bifurcations
- Control in thermodynamics
- Control of flows and turbulence
- Information physics
- Cyber-physical systems
- Modeling and identification of physical systems
- Quantum information and control
- Analysis and control of complex networks
- Synchronization of systems and networks
- Control of mechanical and micromechanical systems
- Dynamics and control of plasma, beams, lasers, nanostructures
- Applications of cybernetic methods in chemistry, biology, and other natural sciences
Papers in cybernetics with a physical flavor, as well as papers in physics with a cybernetic flavor, are welcome. Cybernetics is assumed to include, in addition to control, such areas as estimation, filtering, optimization, identification, information theory, pattern recognition, and other related areas.