{"title":"Emotion Detection using Speech and Face in Deep Learning","authors":"S. Shajith Ahamed, J. Jabez, M. Prithiviraj","doi":"10.1109/ICSCSS57650.2023.10169784","DOIUrl":null,"url":null,"abstract":"Humans have a unique ability to demonstrate and understand emotions through a variety of models of communication. Based on their emotions or mood swings we can judge whether the human subject is in good psychological condition or not. The most visible apparent deficiencies of today’s Emotion capturing systems were their inability to understand the emotions of such patients like mental health disorder, social emotion Agnosia, alexithymia or even autism by using facial expressions. It can be used in schools to help students who find it difficult to express their feelings (introverts) or who have unstable mental health concerns, such as depression, and hence the teacher’s or health workers can communicate with their parents and work through their problems. These days, technology allows employers to recognize individuals who are overly stressed in the workplace and release them from their duties. In research work a Deep Learning algorithm is utilized to create an integrated tool to identify the facial emotions and the stress level or emotion quotient from speech. Tools that can assist people in recognizing the emotions of those around them could be very beneficial in treatment settings as well as in regular social encounters. Emotion detection using speech and face in deep learning has made significant progress in recent years, but there are still several challenges that need to be addressed. Here are some of the main challenges: Limited Dataset: The availability of labeled datasets for emotion detection is limited, especially for less common emotions or for specific cultural contexts. This makes it challenging to train deep learning models that can generalize well to new data. Variability in Data: The data used for emotion detection can vary widely in terms of quality, noise, and variability. For example, speech data can be affected by environmental noise, accents, and speaking styles, while facial data can be affected by lighting conditions, facial expressions, and occlusion. Feature Extraction: Extracting relevant features from speech and facial data can be challenging, especially when dealing with complex emotions that are not easily captured by simple features. This requires careful design of feature extraction algorithms and feature engineering techniques. Interpretability: Deep learning models are often seen as “black boxes” that are difficult to interpret. This can make it challenging to understand how the model is making decisions and to diagnose errors or biases in the model.Ethical and Privacy Concerns: Emotion detection using speech and facial data raises ethical and privacy concerns, as it can be used for sensitive applications such as surveillance, emotion profiling, and behavioral prediction. 
This requires careful consideration of ethical and privacy issues in the design and deployment of deep learning models for emotion detection.","PeriodicalId":217957,"journal":{"name":"2023 International Conference on Sustainable Computing and Smart Systems (ICSCSS)","volume":"86 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-06-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 International Conference on Sustainable Computing and Smart Systems (ICSCSS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSCSS57650.2023.10169784","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
Humans have a unique ability to demonstrate and understand emotions through a variety of modes of communication. From a person's emotions or mood swings, we can judge whether they are in good psychological condition. The most apparent deficiency of today's emotion-capturing systems is their inability to understand, from facial expressions alone, the emotions of patients with conditions such as mental health disorders, social-emotional agnosia, alexithymia, or autism. Such a tool can be used in schools to help students who find it difficult to express their feelings (introverts) or who have mental health concerns such as depression, so that teachers or health workers can communicate with their parents and work through their problems. Technology now also allows employers to recognize individuals who are overly stressed in the workplace and relieve them of their duties. In this research work, a deep learning algorithm is used to create an integrated tool that identifies facial emotions and estimates the stress level or emotion quotient from speech (a minimal two-branch sketch follows this abstract). Tools that help people recognize the emotions of those around them could be very beneficial in treatment settings as well as in everyday social encounters.

Emotion detection using speech and face with deep learning has made significant progress in recent years, but several challenges remain:

Limited datasets: Labeled datasets for emotion detection are scarce, especially for less common emotions or for specific cultural contexts, which makes it difficult to train deep learning models that generalize well to new data.

Variability in data: The data used for emotion detection varies widely in quality and noise. Speech data can be affected by environmental noise, accents, and speaking styles, while facial data can be affected by lighting conditions, facial expressions, and occlusion.

Feature extraction: Extracting relevant features from speech and facial data is challenging, especially for complex emotions that are not easily captured by simple features. This requires careful design of feature extraction algorithms and feature engineering techniques (see the MFCC sketch below).

Interpretability: Deep learning models are often seen as "black boxes" that are difficult to interpret, which makes it hard to understand how a model reaches its decisions and to diagnose errors or biases in the model.

Ethical and privacy concerns: Emotion detection from speech and facial data raises ethical and privacy concerns, as it can be used for sensitive applications such as surveillance, emotion profiling, and behavioral prediction. This requires careful consideration of ethical and privacy issues in the design and deployment of deep learning models for emotion detection.
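The abstract does not disclose the network architecture of the integrated tool, so the following is only a minimal sketch of the general pattern it describes: a two-branch Keras model that fuses a small CNN over facial crops with a dense network over speech features. The 48x48 grayscale input, the 40-dimensional speech vector, the seven-emotion label set, and all layer sizes are illustrative assumptions, not the authors' design.

```python
# Minimal sketch of a two-branch speech + face emotion classifier (Keras).
# Every input shape, layer size, and the 7-class label set is an
# illustrative assumption; the paper does not publish its architecture.
from tensorflow.keras import layers, Model

NUM_EMOTIONS = 7  # assumed label set (e.g., FER2013-style classes)

# Face branch: small CNN over 48x48 grayscale crops (assumed input size).
face_in = layers.Input(shape=(48, 48, 1), name="face")
x = layers.Conv2D(32, 3, activation="relu")(face_in)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(64, 3, activation="relu")(x)
x = layers.MaxPooling2D()(x)
x = layers.Flatten()(x)
face_feat = layers.Dense(128, activation="relu")(x)

# Speech branch: dense layers over a 40-dim MFCC summary vector
# (see the feature-extraction sketch below).
speech_in = layers.Input(shape=(40,), name="speech")
s = layers.Dense(64, activation="relu")(speech_in)
speech_feat = layers.Dense(128, activation="relu")(s)

# Late fusion: concatenate the two modality embeddings, then classify.
fused = layers.Concatenate()([face_feat, speech_feat])
fused = layers.Dense(64, activation="relu")(fused)
out = layers.Dense(NUM_EMOTIONS, activation="softmax", name="emotion")(fused)

model = Model(inputs=[face_in, speech_in], outputs=out)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

Late fusion (concatenating per-modality embeddings) is only one of several possible designs; early fusion or attention-based fusion are common alternatives in the multimodal emotion-recognition literature.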
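On the feature-extraction challenge: a common baseline for the speech side is MFCCs averaged over time to produce a fixed-length vector. Below is a minimal sketch using librosa; the sample rate, n_mfcc=40, and the file path are illustrative choices, not values taken from the paper.

```python
# Minimal MFCC feature-extraction sketch with librosa.
# Sample rate and n_mfcc are illustrative assumptions.
import numpy as np
import librosa

def mfcc_summary(wav_path: str, sr: int = 16000, n_mfcc: int = 40) -> np.ndarray:
    """Load an utterance and return a fixed-length MFCC summary vector."""
    y, sr = librosa.load(wav_path, sr=sr)                     # resample to a common rate
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)    # shape: (n_mfcc, frames)
    return mfcc.mean(axis=1)                                  # average over time -> (n_mfcc,)

# Usage (hypothetical file): features = mfcc_summary("utterance.wav")
```

Mean-pooling over frames discards temporal dynamics; sequence models (RNNs or temporal CNNs over the full MFCC matrix) are a standard way to retain them when simple summary features prove insufficient.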