Title: Learner's cognitive state recognition based on multimodal physiological signal fusion
Authors: Yingting Li, Yue Li, Xiuling He, Jing Fang, ChongYang Zhou, Chenxu Liu
Journal: Applied Intelligence, Vol. 55, No. 2 (Q2, Computer Science, Artificial Intelligence; IF 3.4)
Published: 2024-12-11
DOI: 10.1007/s10489-024-05958-1
URL: https://link.springer.com/article/10.1007/s10489-024-05958-1
Citations: 0
Abstract
It is crucial to evaluate learning outcomes by identifying the cognitive state of the learner during the learning process. Studies utilizing Electroencephalography (EEG) and other peripheral physiological signals, combined with deep learning models, have demonstrated improved performance in cognitive state recognition. These studies have primarily focused on unimodal data, which are vulnerable to various types of noise, making it difficult to fully capture and represent cognitive states. Leveraging the complementarity between multimodal physiological signals can mitigate the impact of anomalies in unimodal data, thereby improving the accuracy and stability of cognitive state recognition. Therefore, this study proposes a multimodal physiological signal feature representation fusion model based on multi-level attention (PSFMMA). The model aims to integrate multimodal physiological signals to identify learners’ cognitive states with greater stability and accuracy. PSFMMA first extracts the temporal features of physiological signals by multiplexing the embedding layer. Subsequently, it generates signal representation vectors by further extracting semantic features through a signal feature mapping layer and enhancing important features with designed attention modules. Finally, the model employs an attention mechanism based on different signal representation vectors to fuse multimodal information for identifying learners’ cognitive states. This study designs various learning activities and collects electroencephalography (EEG), electrodermal activity (EDA), and photoplethysmography (PPG) data from 22 participants engaging in these activities to create the Based on Learning Activities Collection (BLAC) dataset. The proposed model was evaluated on the BLAC dataset, achieving an identification accuracy of 96.32 ± 0.32%. The results demonstrate that the model can effectively recognize learners’ cognitive states. 
Furthermore, the model’s performance was validated on the publicly available emotion classification dataset DEAP, attaining an accuracy of 99.15 ± 0.12%. The source code is available at https://github.com/chengshudaxuesheng/PSFMMA.
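The abstract describes PSFMMA's final stage as an attention mechanism that weighs the representation vectors of the three modalities (EEG, EDA, PPG) before fusing them. As a rough illustration of that idea (not the authors' implementation — their code is at the GitHub link above), the sketch below scores each modality vector with a hypothetical learnable vector `w`, normalizes the scores with a softmax, and fuses the modalities as the attention-weighted sum:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector
    e = np.exp(x - x.max())
    return e / e.sum()

def fuse_modalities(reprs, w):
    """Attention-based fusion sketch (hypothetical, simplified).

    reprs: dict mapping modality name -> (d,) representation vector,
           e.g. the per-signal vectors PSFMMA produces before fusion
    w:     (d,) scoring vector standing in for the learned attention
           parameters (an assumption; the paper's module is richer)
    Returns the fused (d,) vector and the per-modality attention weights.
    """
    names = list(reprs)
    R = np.stack([reprs[n] for n in names])  # (m, d) stacked modalities
    scores = R @ w                           # one relevance score per modality
    alpha = softmax(scores)                  # attention weights, sum to 1
    fused = alpha @ R                        # weighted sum -> (d,) fused vector
    return fused, dict(zip(names, alpha))

# Toy usage with random stand-ins for EEG/EDA/PPG representations
rng = np.random.default_rng(0)
d = 8
reprs = {m: rng.normal(size=d) for m in ("EEG", "EDA", "PPG")}
fused, weights = fuse_modalities(reprs, rng.normal(size=d))
```

Because the weights sum to 1, a noisy or uninformative modality can be down-weighted rather than corrupting the fused representation — the complementarity argument the abstract makes for multimodal over unimodal input.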
Journal description:
With a focus on research in artificial intelligence and neural networks, this journal addresses issues involving solutions of real-life manufacturing, defense, management, government and industrial problems which are too complex to be solved through conventional approaches and require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance.
The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.