The Role of Time in Facial Dynamics and Challenges in Automatic Emotion Recognition (2019–2024)

IF 4.3 Q1 PSYCHOLOGY, MULTIDISCIPLINARY

Human Behavior and Emerging Technologies, Pub Date: 2025-06-25, DOI: 10.1155/hbe2/7777949

Williams Contreras-Higuera, Lucrezia Crescenzi-Lanna


Citations: 0

Abstract


Based on a comprehensive literature review, this study highlights the critical role of the temporal dimension of facial dynamics in understanding facial expressions and improving the accuracy and robustness of automatic emotion recognition systems (machine-FER). While deep learning (DL) techniques like convolutional neural networks (CNNs) and long short-term memory (LSTM) networks offer significant advances, they face challenges such as gradient vanishing and overfitting, particularly in long and complex sequences. Vision transformers (ViTs) show promise but require integration with algorithms to mitigate spatial noise. Conventional machine learning (CML) methods like support vector machine (SVM) remain robust, especially in smaller datasets. The study underscores the importance of multimodal data synchronization (e.g., video, voice) in classifying emotions more accurately, reflecting both human and machine learning capabilities. It also addresses the limitations of current models, including cultural biases and the need for large, diverse datasets. The findings suggest that future research should focus on real-world conditions, integrating sequential multimodal data and employing supervised models based on theoretical and empirical frameworks. This approach is aimed at enhancing the understanding and classification of facial emotions, ensuring data quality and acceptable results through systematic human observations. The study provides valuable insights for selecting appropriate algorithms that are tailored to specific research objectives and contexts.


Source journal

Human Behavior and Emerging Technologies Social Sciences-Social Sciences (all)

CiteScore: 17.20

Self-citation rate: 8.70%


About the journal: Human Behavior and Emerging Technologies is an interdisciplinary journal dedicated to publishing high-impact research that enhances understanding of the complex interactions between diverse human behavior and emerging digital technologies.