Dmitry Dmitriyevich Averianov, Mikhail Valerievich Zheludev, Vladimir Ilyich Kiyaev
Differencialnie Uravnenia i Protsesy Upravlenia, 2023. DOI: https://doi.org/10.21638/11701/spbu35.2023.306
Construction of an Emotional Image of a Person Based on the Analysis of Key Points in Consecutive Frames of a Video Sequence
This work develops an algorithm for classifying human behavior in order to detect whether statements recorded on video are truthful or false. Each video file is analyzed within a time window, in which both micro-movements of the facial muscles and speech features are examined. A facial expression is represented as a vector that encodes the state of the face through the positions of key points (on the nose, eyebrows, eyes, eyelids, etc.). This facial-expression vector is produced by trained non-linear models, while the speech vector is built from heuristic characteristics of the audio signal. A separate neural network aggregates the vectors over time to produce the final behavior classification. The paper reports the accuracy and speed of the algorithm; the results show that the new approach is competitive with existing methods.
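The data flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The trained non-linear models and the aggregating neural network are replaced here by simple hand-written stand-ins (centroid-normalized key points, mean frame-to-frame change, RMS energy, and zero-crossing rate). All function names, the choice of features, and the overlapping-window scheme are assumptions made for illustration only.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def normalize_keypoints(points: Sequence[Point]) -> List[float]:
    """Flatten key points after centering on the centroid and scaling
    by the mean distance to it (a stand-in for a learned face encoder)."""
    n = len(points)
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    scale = sum(math.hypot(x - cx, y - cy) for x, y in points) / n or 1.0
    flat: List[float] = []
    for x, y in points:
        flat.extend(((x - cx) / scale, (y - cy) / scale))
    return flat


def audio_features(samples: Sequence[float]) -> List[float]:
    """Two assumed heuristic descriptors: RMS energy and zero-crossing rate."""
    if not samples:
        return [0.0, 0.0]
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    return [rms, crossings / max(len(samples) - 1, 1)]


def window_vector(frames: Sequence[Sequence[Point]],
                  samples: Sequence[float]) -> List[float]:
    """Describe one time window: mean face vector, mean absolute
    frame-to-frame change (a crude proxy for micro-movements), audio features."""
    faces = [normalize_keypoints(f) for f in frames]
    dim = len(faces[0])
    mean_face = [sum(f[i] for f in faces) / len(faces) for i in range(dim)]
    if len(faces) > 1:
        pairs = list(zip(faces, faces[1:]))
        motion = [sum(abs(b[i] - a[i]) for a, b in pairs) / len(pairs)
                  for i in range(dim)]
    else:
        motion = [0.0] * dim
    return mean_face + motion + audio_features(samples)


def sliding_windows(n_frames: int, size: int, step: int) -> List[Tuple[int, int]]:
    """Half-open (start, end) frame ranges; overlapping windows are an assumption."""
    return [(s, s + size) for s in range(0, n_frames - size + 1, step)]


if __name__ == "__main__":
    face = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]   # three hypothetical key points
    video = [face] * 5                            # five identical frames
    audio = [1.0, -1.0, 1.0, -1.0]                # toy audio samples
    for start, end in sliding_windows(len(video), size=3, step=1):
        vec = window_vector(video[start:end], audio)
        print(start, end, len(vec))
```

In a full system, the per-window vectors would be passed in temporal order to a sequence classifier. The abstract states only that a separate neural network performs this aggregation, so its architecture is left open here.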