Krsto Prorokovic, Michael Wand, Tanja Schultz, J. Schmidhuber
Adaptation of an EMG-Based Speech Recognizer via Meta-Learning
2019 IEEE Global Conference on Signal and Information Processing (GlobalSIP), published 2019-11-01
DOI: 10.1109/GlobalSIP45357.2019.8969231
Citations: 4
Abstract
In nonacoustic speech recognition based on electromyography, i.e. on electrical muscle activity captured by noninvasive surface electrodes, differences between recording sessions are known to degrade system accuracy. Efficient adaptation of an existing system to an unseen recording session is therefore imperative for practical usage scenarios. We report on a meta-learning approach to pretraining a deep neural network frontend for a myoelectric speech recognizer so that it can be easily adapted to a new session. Fine-tuning this specially pretrained network yields lower Word Error Rates and higher frame accuracies than fine-tuning a conventionally pretrained network, without increasing the computational burden on a possibly mobile device.
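
The abstract does not name the meta-learning algorithm, so the following is only a minimal sketch of the general idea, using Reptile-style first-order meta-learning as a stand-in rather than the authors' method. Everything in it is a hypothetical toy: each "session" is a small linear regression problem with its own slope and offset, and the names `make_session`, `fine_tune`, and `reptile_pretrain` are illustrative. The point it tries to show is that pretraining repeatedly simulates adaptation to a session, while adaptation to a new session afterwards is ordinary fine-tuning with no extra cost.

```python
import random

def mse_and_grad(params, data):
    """Mean squared error of y = w*x + b on data, plus its gradient."""
    w, b = params
    n = len(data)
    loss = gw = gb = 0.0
    for x, y in data:
        err = w * x + b - y
        loss += err * err / n
        gw += 2 * err * x / n
        gb += 2 * err / n
    return loss, [gw, gb]

def fine_tune(params, data, steps=5, lr=0.1):
    """Ordinary fine-tuning: a few full-batch gradient steps on one session."""
    params = list(params)
    for _ in range(steps):
        _, grad = mse_and_grad(params, data)
        params = [p - lr * g for p, g in zip(params, grad)]
    return params

def make_session(rng, n=20):
    """Hypothetical toy session: a line with session-specific slope and offset."""
    w = rng.uniform(1.0, 3.0)
    b = rng.uniform(-1.0, 1.0)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, w * x + b) for x in xs]

def reptile_pretrain(sessions, meta_steps=200, meta_lr=0.5, seed=0):
    """Reptile-style pretraining (an assumption, not the paper's algorithm):
    adapt to a sampled session, then move the shared initialization
    part of the way toward the adapted parameters."""
    rng = random.Random(seed)
    theta = [0.0, 0.0]
    for _ in range(meta_steps):
        data = rng.choice(sessions)
        adapted = fine_tune(theta, data)
        theta = [t + meta_lr * (a - t) for t, a in zip(theta, adapted)]
    return theta

rng = random.Random(0)
train_sessions = [make_session(rng) for _ in range(10)]
theta = reptile_pretrain(train_sessions)

# Adapting to an unseen session uses the same fine_tune routine as before.
new_session = make_session(rng)
loss_before, _ = mse_and_grad(theta, new_session)
loss_after, _ = mse_and_grad(fine_tune(theta, new_session), new_session)
print("loss before fine-tuning:", loss_before)
print("loss after fine-tuning: ", loss_after)
```

In this sketch the meta-learning cost is paid entirely during pretraining; the device that adapts to a new session only runs `fine_tune`, which is consistent with the abstract's claim of no added computational burden at adaptation time.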