Development of an advanced AI-based model for human psychoemotional state analysis
Zharas Ainakulov, Kayrat Koshekov, Alexey Savostin, R. Anayatova, B. Seidakhmetov, G. Kurmankulova
DOI: 10.15587/1729-4061.2023.293011
Journal: Eastern-European Journal of Enterprise Technologies, Vol. 76, No. 6
Published: 2023-12-28
Citations: 0
Abstract
The research focuses on developing a novel method for the automatic recognition of human psychoemotional states (PES) using deep learning technology. The method is centered on analyzing speech signals to classify distinct emotional states. The primary challenge addressed by this research is the accurate multiclass classification of seven human psychoemotional states: joy, fear, anger, sadness, disgust, surprise, and a neutral state. Traditional methods have struggled to distinguish these complex emotional nuances in speech. The study developed a model that extracts informative features from audio recordings, specifically mel spectrograms and mel-frequency cepstral coefficients. These features were then used to train two deep convolutional neural networks, resulting in a classifier model. The uniqueness of this research lies in its dual-feature approach and its use of deep convolutional neural networks for classification. The approach has demonstrated high accuracy in emotion recognition, reaching 0.93 on the validation subset. The model's high accuracy and effectiveness can be attributed to the comprehensive and synergistic use of mel spectrograms and mel-frequency cepstral coefficients, which provide a more nuanced analysis of emotional expression in speech. The method presented in this research has broad applicability across domains, including human-machine interface interactions, the aviation industry, healthcare, marketing, and other fields where understanding human emotions through speech is crucial.
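The paper itself does not publish code, so the following Python sketch only illustrates the general dual-feature idea described in the abstract: extracting both a log-mel spectrogram and an MFCC matrix from each recording (here via librosa) and feeding them into two convolutional branches that are merged into a single seven-class classifier (here in Keras). All parameter values (sample rate, clip duration, number of mel bands and MFCCs, layer sizes) are illustrative assumptions, not the authors' configuration.

```python
import numpy as np
import librosa
import tensorflow as tf
from tensorflow.keras import layers, models

# Illustrative settings -- not taken from the paper.
SAMPLE_RATE = 16000
N_MELS = 128
N_MFCC = 40
NUM_CLASSES = 7  # joy, fear, anger, sadness, disgust, surprise, neutral


def extract_features(path, duration=3.0):
    """Compute a log-mel spectrogram and an MFCC matrix for one audio clip."""
    y, sr = librosa.load(path, sr=SAMPLE_RATE, duration=duration)
    # Pad or trim so every clip yields feature maps of the same shape.
    y = librosa.util.fix_length(y, size=int(SAMPLE_RATE * duration))
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=N_MELS)
    log_mel = librosa.power_to_db(mel, ref=np.max)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=N_MFCC)
    # Add a channel axis so the 2-D features can be fed to Conv2D layers.
    return log_mel[..., np.newaxis], mfcc[..., np.newaxis]


def cnn_branch(input_shape):
    """A small convolutional branch applied to one feature type."""
    inp = layers.Input(shape=input_shape)
    x = layers.Conv2D(32, 3, activation="relu", padding="same")(inp)
    x = layers.MaxPooling2D(2)(x)
    x = layers.Conv2D(64, 3, activation="relu", padding="same")(x)
    x = layers.MaxPooling2D(2)(x)
    x = layers.GlobalAveragePooling2D()(x)
    return inp, x


def build_dual_feature_model(mel_shape, mfcc_shape):
    """Merge a mel-spectrogram branch and an MFCC branch into one classifier."""
    mel_in, mel_feat = cnn_branch(mel_shape)
    mfcc_in, mfcc_feat = cnn_branch(mfcc_shape)
    merged = layers.concatenate([mel_feat, mfcc_feat])
    x = layers.Dense(128, activation="relu")(merged)
    out = layers.Dense(NUM_CLASSES, activation="softmax")(x)
    model = models.Model([mel_in, mfcc_in], out)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

In use, the two feature arrays extracted from each file would be stacked into two input tensors and passed together to model.fit, with integer emotion labels as targets; the split into a two-branch network mirrors the dual-feature approach the abstract describes, while the specific architecture here is only a placeholder.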
Journal Description
The term "enterprise technologies" in the title of the Eastern-European Journal of Enterprise Technologies should be read as "industrial technologies". The journal publishes the best ideas from science that can be introduced into industry. Obtaining high-quality, competitive industrial products depends on introducing high technologies from various independent spheres of scientific research, united by a common end result: a finished high-technology product. These spheres include engineering, power engineering and energy saving, technologies of inorganic and organic substances and materials science, and information technologies and control systems. Publishing scientific papers in these directions is the main development vector of the journal, since these are the areas of research whose results can be directly applied in modern industrial production: the space and aircraft industries, instrument making, mechanical engineering, power engineering, the chemical industry, and metallurgy.