Silent Speech Interface using Continuous-Wave Radar and Optimized AlexNet

K. K. Mohd Shariff, Megat Zuhairy Megat Tajuddin, M. Younis, Megat Syahirul Amin Megat Ali
Published in: 2022 IEEE Symposium on Wireless Technology & Applications (ISWTA), 2022-08-17
DOI: https://doi.org/10.1109/ISWTA55313.2022.9942770
Citations: 0

Abstract

Silent speech interface systems enable human-computer interaction via speech in the absence of an acoustic signal. Applications range from speech recognition in noisy environments to interaction with speech-impaired individuals. This study proposes a state-of-the-art silent speech interface based on continuous-wave radar. Six volunteers participated in the study and were asked to silently utter four native Malay command words, yielding a total of 1,180 samples. Spectrograms of the mouth movements, constructed from the radar echo, serve as profiles of the silent commands. These feature images are used to develop a deep learning model by performing transfer learning on the AlexNet architecture. Different hyperparameter settings are evaluated. The best performance is obtained when the network is trained using the Adam optimizer with a batch size of 512. The optimized model attained classification accuracies of 99.8% for training and 98.3% for validation.
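The abstract states that spectrograms are built from the radar echo but gives no processing details. As a rough illustration only, the sketch below computes a short-time Fourier transform spectrogram with NumPy. The frame length, hop size, Hann window, sampling rate, and the synthetic single-tone "echo" are all assumptions made for this example and are not taken from the paper; the authors' actual pipeline may differ.

```python
import numpy as np

def spectrogram(signal, frame_len=128, hop=32):
    """Return a dB-magnitude STFT, shape (frequency bins, time frames).

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    signal = np.asarray(signal)
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # FFT per frame; fftshift puts zero Doppler in the middle of the axis.
    spectrum = np.fft.fftshift(np.fft.fft(frames, axis=1), axes=1)
    # Small constant avoids log10(0).
    return (20.0 * np.log10(np.abs(spectrum) + 1e-12)).T

def doppler_axis(frame_len, fs):
    """Frequency axis in Hz matching the rows returned by spectrogram()."""
    return np.fft.fftshift(np.fft.fftfreq(frame_len, d=1.0 / fs))

# Synthetic stand-in for a radar echo: one complex tone at 62.5 Hz.
fs = 1000.0                      # assumed sampling rate in Hz
t = np.arange(1000) / fs         # one second of samples
echo = np.exp(2j * np.pi * 62.5 * t)

spec = spectrogram(echo)
freqs = doppler_axis(128, fs)
peak_hz = freqs[np.argmax(spec[:, 0])]  # strongest bin in the first frame
```

In a setup like the one described, each such spectrogram would be rendered as an image and resized to the input dimensions expected by the pretrained network before transfer learning.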