Convolutional Neural Network applied in EEG imagined phoneme recognition system

Ana-Luiza Rusnac, O. Grigore
{"title":"卷积神经网络在脑电想象音素识别系统中的应用","authors":"Ana-Luiza Rusnac, O. Grigore","doi":"10.1109/ATEE52255.2021.9425217","DOIUrl":null,"url":null,"abstract":"Speech is a skill that most of the time we take for granted. In reality, this ability is a complex mechanism which requires thoughts to be translated into words who are further transposed into sounds, a mechanism which involves precise coordination of several muscles and joints. In some cases, this complex mechanism can no longer be performed and may be accompanied by almost complete loss of motor activity such in diseases as: stroke, Lock-Down syndrome, amyotrophic lateral sclerosis, cerebral palsy etc. The most recent method that aims to supplement the speech mechanism is imaginary speech recognition using electroencephalographic (EEG) signals, by using complex computing mechanisms like Deep Learning (DL) in order to decode the thoughts. In this paper we aim to recognize three types of clustered phonemes using conventional speech recognition techniques, like Mel-Cepstral Coefficients (MFCC) and Linear Predictive Coding (LPC) combined with Convolutional Neural Networks (CNN). We compared four types of features extraction: MFCC, LPC, MFCC+ LPC combined into 1-channel matrix and MFCC+ LPC combined into a 2-channel matrix. We showed that MFCC coefficients offer a better accuracy than LPC and that concatenating MFCC and LPC into a 2-channel matrix we obtain a better performance than combining them into 1-channel matrix.","PeriodicalId":359645,"journal":{"name":"2021 12th International Symposium on Advanced Topics in Electrical Engineering (ATEE)","volume":"134 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Convolutional Neural Network applied in EEG imagined phoneme recognition system\",\"authors\":\"Ana-Luiza Rusnac, O. Grigore\",\"doi\":\"10.1109/ATEE52255.2021.9425217\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Speech is a skill that most of the time we take for granted. In reality, this ability is a complex mechanism which requires thoughts to be translated into words who are further transposed into sounds, a mechanism which involves precise coordination of several muscles and joints. In some cases, this complex mechanism can no longer be performed and may be accompanied by almost complete loss of motor activity such in diseases as: stroke, Lock-Down syndrome, amyotrophic lateral sclerosis, cerebral palsy etc. The most recent method that aims to supplement the speech mechanism is imaginary speech recognition using electroencephalographic (EEG) signals, by using complex computing mechanisms like Deep Learning (DL) in order to decode the thoughts. In this paper we aim to recognize three types of clustered phonemes using conventional speech recognition techniques, like Mel-Cepstral Coefficients (MFCC) and Linear Predictive Coding (LPC) combined with Convolutional Neural Networks (CNN). We compared four types of features extraction: MFCC, LPC, MFCC+ LPC combined into 1-channel matrix and MFCC+ LPC combined into a 2-channel matrix. 
We showed that MFCC coefficients offer a better accuracy than LPC and that concatenating MFCC and LPC into a 2-channel matrix we obtain a better performance than combining them into 1-channel matrix.\",\"PeriodicalId\":359645,\"journal\":{\"name\":\"2021 12th International Symposium on Advanced Topics in Electrical Engineering (ATEE)\",\"volume\":\"134 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-03-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 12th International Symposium on Advanced Topics in Electrical Engineering (ATEE)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ATEE52255.2021.9425217\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 12th International Symposium on Advanced Topics in Electrical Engineering (ATEE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ATEE52255.2021.9425217","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Speech is a skill that most of the time we take for granted. In reality, this ability is a complex mechanism which requires thoughts to be translated into words that are further transposed into sounds, a mechanism which involves precise coordination of several muscles and joints. In some cases, this complex mechanism can no longer be performed and may be accompanied by almost complete loss of motor activity, as in diseases such as stroke, locked-in syndrome, amyotrophic lateral sclerosis, and cerebral palsy. The most recent approach that aims to supplement the speech mechanism is imagined speech recognition using electroencephalographic (EEG) signals, relying on complex computational methods such as Deep Learning (DL) to decode the thoughts. In this paper we aim to recognize three types of clustered phonemes using conventional speech recognition techniques, namely Mel-Frequency Cepstral Coefficients (MFCC) and Linear Predictive Coding (LPC), combined with Convolutional Neural Networks (CNN). We compared four types of feature extraction: MFCC, LPC, MFCC+LPC combined into a 1-channel matrix, and MFCC+LPC combined into a 2-channel matrix. We showed that MFCC coefficients offer better accuracy than LPC, and that concatenating MFCC and LPC into a 2-channel matrix yields better performance than combining them into a 1-channel matrix.
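
To make the feature construction concrete, the sketch below shows one way to build the MFCC+LPC representations described above and feed the 2-channel variant to a small CNN. This is a minimal illustrative example, not the authors' pipeline: librosa is used for the MFCC/LPC computation, PyTorch for the network, and the frame length, hop size, number of coefficients, LPC order, mel-band count, and layer sizes are all assumed values.

```python
# Minimal sketch (not the authors' implementation) of the MFCC + LPC feature
# construction and an illustrative CNN. All numeric choices -- 13 MFCCs,
# LPC order 13, 256-sample frames, 128-sample hop, 20 mel bands, the layer
# sizes -- are assumptions made for the sake of a runnable example.
import numpy as np
import librosa
import torch
import torch.nn as nn


def mfcc_lpc_features(signal, fs, n_coeff=13, frame_len=256, hop=128):
    """Frame-wise MFCC and LPC coefficients for a single EEG channel."""
    mfcc = librosa.feature.mfcc(y=signal, sr=fs, n_mfcc=n_coeff,
                                n_fft=frame_len, hop_length=hop, n_mels=20)
    frames = librosa.util.frame(signal, frame_length=frame_len, hop_length=hop)
    # librosa.lpc returns order+1 coefficients; drop the leading 1 so the
    # LPC map has the same number of rows (n_coeff) as the MFCC map.
    lpc = np.stack([librosa.lpc(np.ascontiguousarray(frames[:, i]), order=n_coeff)[1:]
                    for i in range(frames.shape[1])], axis=1)
    n = min(mfcc.shape[1], lpc.shape[1])          # align frame counts
    return mfcc[:, :n], lpc[:, :n]                # each (n_coeff, n_frames)


def two_channel_matrix(signal, fs):
    """Stack MFCC and LPC depth-wise: shape (2, n_coeff, n_frames)."""
    mfcc, lpc = mfcc_lpc_features(signal, fs)
    return np.stack([mfcc, lpc], axis=0)


def one_channel_matrix(signal, fs):
    """Concatenate MFCC and LPC along the coefficient axis: (1, 2*n_coeff, n_frames)."""
    mfcc, lpc = mfcc_lpc_features(signal, fs)
    return np.concatenate([mfcc, lpc], axis=0)[np.newaxis, ...]


class PhonemeCNN(nn.Module):
    """Illustrative CNN for the three clustered-phoneme classes."""

    def __init__(self, in_channels, n_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))


# Example on a synthetic 2-second, 256 Hz signal standing in for one EEG channel.
fs = 256
eeg = np.random.randn(2 * fs).astype(np.float32)
x = torch.from_numpy(two_channel_matrix(eeg, fs)).float().unsqueeze(0)  # (1, 2, 13, n_frames)
print(PhonemeCNN(in_channels=2)(x).shape)  # torch.Size([1, 3])
```

Swapping two_channel_matrix for one_channel_matrix (and setting in_channels=1) gives the 1-channel configuration that the paper compares against the 2-channel one.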