{"title":"Multimodal Emotion Fusion Mechanism and Empathetic Responses in Companion Robots","authors":"Xiaofeng Liu;Qincheng Lv;Jie Li;Siyang Song;Angelo Cangelosi","doi":"10.1109/TCDS.2024.3442203","DOIUrl":null,"url":null,"abstract":"The ability of humanoid robots to exhibit empathetic facial expressions and provide corresponding responses is essential for natural human–robot interaction. To enhance this, we integrate the GPT3.5 model with a facial expression recognition model, creating a multimodal emotion recognition system. Additionally, we address the challenge of realistically mimicking human facial expressions by designing the physical structure of a humanoid robot. Initially, we develop a humanoid robot capable of adjusting the positions of its facial organs and neck through servo displacement to achieve more natural facial expressions. Subsequently, to overcome the current limitation where emotional interaction robots struggle to accurately recognize user emotions, we introduce a coupled generative pretrained transformer (GPT)-based multimodal emotion recognition method that utilizes both text and images, thereby enhancing the robot's emotion recognition accuracy. Finally, we integrate the GPT-3.5 model to generate empathetic responses based on recognized user emotional states and language text, which are then mapped onto the robot to enable empathetic expressions that can achieve a more comfortable human–machine interaction experience. Experimental results on benchmark databases demonstrate that the performance of the coupled GPT-based multimodal emotion recognition method using text and images outperforms other approaches, and it possesses unique empathetic response capabilities relative to alternative methods.","PeriodicalId":54300,"journal":{"name":"IEEE Transactions on Cognitive and Developmental Systems","volume":"17 2","pages":"271-286"},"PeriodicalIF":4.9000,"publicationDate":"2024-08-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Cognitive and Developmental Systems","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10634513/","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
The ability of humanoid robots to exhibit empathetic facial expressions and provide corresponding responses is essential for natural human–robot interaction. To this end, we integrate the GPT-3.5 model with a facial expression recognition model to create a multimodal emotion recognition system, and we address the challenge of realistically mimicking human facial expressions through the physical design of a humanoid robot. First, we develop a humanoid robot that adjusts the positions of its facial organs and neck through servo displacement to produce more natural facial expressions. Then, to overcome the current limitation that emotional interaction robots struggle to accurately recognize user emotions, we introduce a coupled generative pretrained transformer (GPT)-based multimodal emotion recognition method that uses both text and images, improving the robot's emotion recognition accuracy. Finally, we use the GPT-3.5 model to generate empathetic responses from the recognized user emotional state and language text; these responses are then mapped onto the robot, enabling empathetic expressions and a more comfortable human–machine interaction experience. Experimental results on benchmark databases demonstrate that the coupled GPT-based multimodal emotion recognition method using text and images outperforms other approaches and offers empathetic response capabilities that alternative methods lack.
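The abstract outlines a three-stage pipeline: per-modality emotion recognition (face image and utterance text), fusion of the two predictions, and an empathetic reply that is also mapped to the robot's servo-driven expression. The Python sketch below illustrates how such a text-plus-image late-fusion step could be wired up. It is not the authors' implementation: every name in it (recognize_face_emotion, recognize_text_emotion, query_gpt35, SERVO_POSES, the fusion weight alpha) is a hypothetical placeholder, and the stubbed branches return fixed distributions only so the sketch runs without model weights or an API key.

# Illustrative sketch only: late fusion of a visual and a GPT-based text
# emotion classifier, followed by an empathetic reply and a servo pose.
# All interfaces below are hypothetical placeholders, not the paper's code.

EMOTIONS = ("happy", "sad", "angry", "surprised", "neutral")

# Hypothetical servo targets (degrees) for facial-organ and neck joints.
SERVO_POSES = {
    "happy":     {"brow": 15,  "mouth_corner": 25,  "neck_pitch": 0},
    "sad":       {"brow": -10, "mouth_corner": -20, "neck_pitch": -8},
    "angry":     {"brow": -20, "mouth_corner": -10, "neck_pitch": 0},
    "surprised": {"brow": 25,  "mouth_corner": 10,  "neck_pitch": 5},
    "neutral":   {"brow": 0,   "mouth_corner": 0,   "neck_pitch": 0},
}

def recognize_face_emotion(image) -> dict:
    """Stand-in for the facial-expression model: probability per emotion."""
    # A real system would run a classifier on the face crop; fixed values
    # keep this sketch self-contained.
    return {"happy": 0.70, "sad": 0.05, "angry": 0.05,
            "surprised": 0.10, "neutral": 0.10}

def recognize_text_emotion(utterance: str) -> dict:
    """Stand-in for the GPT-based text branch (e.g., a classification prompt)."""
    return {"happy": 0.60, "sad": 0.10, "angry": 0.05,
            "surprised": 0.15, "neutral": 0.10}

def query_gpt35(prompt: str) -> str:
    """Placeholder for a GPT-3.5 chat-completion call."""
    return "That's wonderful news, congratulations! How did you celebrate?"

def fuse(p_face: dict, p_text: dict, alpha: float = 0.6) -> str:
    """Late fusion: weighted sum of the two modality distributions."""
    fused = {e: alpha * p_face.get(e, 0.0) + (1.0 - alpha) * p_text.get(e, 0.0)
             for e in EMOTIONS}
    return max(fused, key=fused.get)

def respond(image, utterance: str):
    emotion = fuse(recognize_face_emotion(image),
                   recognize_text_emotion(utterance))
    reply = query_gpt35(
        f"The user said: '{utterance}' and appears {emotion}. "
        "Reply with one short empathetic sentence."
    )
    return reply, SERVO_POSES[emotion]   # verbal reply + expression to enact

if __name__ == "__main__":
    reply, pose = respond(image=None, utterance="I passed my exam today!")
    print(reply)
    print(pose)

The weighted sum here is only a stand-in for the paper's coupled fusion mechanism; the sketch is meant to show the data flow from the two recognizers to the reply and the servo pose, not the fusion method itself.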
Journal Introduction:
The IEEE Transactions on Cognitive and Developmental Systems (TCDS) focuses on advances in the study of development and cognition in natural (humans, animals) and artificial (robots, agents) systems. It welcomes contributions from multiple related disciplines including cognitive systems, cognitive robotics, developmental and epigenetic robotics, autonomous and evolutionary robotics, social structures, multi-agent and artificial life systems, computational neuroscience, and developmental psychology. Articles on theoretical, computational, application-oriented, and experimental studies as well as reviews in these areas are considered.