K. K. Mohd Shariff, Megat Zuhairy Megat Tajuddin, M. Younis, Megat Syahirul Amin Megat Ali
{"title":"基于连续波雷达和优化AlexNet的静音语音接口","authors":"K. K. Mohd Shariff, Megat Zuhairy Megat Tajuddin, M. Younis, Megat Syahirul Amin Megat Ali","doi":"10.1109/ISWTA55313.2022.9942770","DOIUrl":null,"url":null,"abstract":"Silent speech interface systems enables human-computer interaction via speech, in the absence of an acoustic signal. The applications range from speech recognition in noisy environment to interaction with speech-impaired individuals. This study proposes a state-of-the-art solution to silent speech interface based on continuous-wave radar. Six volunteers have participated in the study. They are required to silently utter four native Malay command words. A total of 1,180 samples have been obtained. The spectrograms of mouth movements from radar echo are constructed as profile of the silent commands. The feature images are used to develop a deep learning model by performing transfer learning on the AlexNet architecture. Different hyperparameter settings are evaluated. The best performance is obtained when the network is trained using Adam optimizer at with batch size of 512. The optimized model attained classification accuracies of 99.8% for training, and 98.3% for validation.","PeriodicalId":293957,"journal":{"name":"2022 IEEE Symposium on Wireless Technology & Applications (ISWTA)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Silent Speech Interface using Continuous-Wave Radar and Optimized AlexNet\",\"authors\":\"K. K. Mohd Shariff, Megat Zuhairy Megat Tajuddin, M. Younis, Megat Syahirul Amin Megat Ali\",\"doi\":\"10.1109/ISWTA55313.2022.9942770\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Silent speech interface systems enables human-computer interaction via speech, in the absence of an acoustic signal. 
The applications range from speech recognition in noisy environment to interaction with speech-impaired individuals. This study proposes a state-of-the-art solution to silent speech interface based on continuous-wave radar. Six volunteers have participated in the study. They are required to silently utter four native Malay command words. A total of 1,180 samples have been obtained. The spectrograms of mouth movements from radar echo are constructed as profile of the silent commands. The feature images are used to develop a deep learning model by performing transfer learning on the AlexNet architecture. Different hyperparameter settings are evaluated. The best performance is obtained when the network is trained using Adam optimizer at with batch size of 512. The optimized model attained classification accuracies of 99.8% for training, and 98.3% for validation.\",\"PeriodicalId\":293957,\"journal\":{\"name\":\"2022 IEEE Symposium on Wireless Technology & Applications (ISWTA)\",\"volume\":\"7 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-08-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE Symposium on Wireless Technology & Applications (ISWTA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ISWTA55313.2022.9942770\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE Symposium on Wireless Technology & Applications 
(ISWTA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISWTA55313.2022.9942770","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Silent Speech Interface using Continuous-Wave Radar and Optimized AlexNet
Silent speech interface systems enable human-computer interaction via speech in the absence of an acoustic signal. Applications range from speech recognition in noisy environments to interaction with speech-impaired individuals. This study proposes a continuous-wave radar-based solution to the silent speech interface. Six volunteers participated in the study, each silently uttering four native Malay command words, yielding a total of 1,180 samples. Spectrograms of mouth movements, derived from the radar echo, serve as profiles of the silent commands. These feature images are used to develop a deep learning model by performing transfer learning on the AlexNet architecture. Different hyperparameter settings are evaluated; the best performance is obtained when the network is trained with the Adam optimizer and a batch size of 512. The optimized model attained classification accuracies of 99.8% for training and 98.3% for validation.
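The feature-extraction step described in the abstract, building a spectrogram of mouth movement from the continuous-wave radar echo, can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the sample rate, carrier wavelength, lip-motion model, and STFT window/hop sizes are all assumed values chosen only to make the sketch self-contained.

```python
import numpy as np

# Simulated CW radar echo: mouth motion phase-modulates the return signal
# (micro-Doppler). All parameters below are illustrative assumptions.
fs = 2000                                        # sample rate in Hz (assumed)
t = np.arange(0, 1.0, 1 / fs)
mouth_disp = 0.004 * np.sin(2 * np.pi * 3 * t)   # ~4 mm lip motion at 3 Hz (assumed)
wavelength = 0.0125                              # e.g. a 24 GHz radar (assumed)
echo = np.cos(4 * np.pi * mouth_disp / wavelength)

# Short-time Fourier transform -> spectrogram, i.e. the "feature image"
# that would be fed to the AlexNet-based classifier.
win, hop = 256, 64
frames = [echo[i:i + win] * np.hanning(win)
          for i in range(0, len(echo) - win + 1, hop)]
spec = np.abs(np.fft.rfft(np.stack(frames), axis=1)) ** 2  # power spectrogram

print(spec.shape)  # (time frames, frequency bins) -> (28, 129)
```

In a pipeline like the one described, each such time-frequency image would be rendered at the network's input resolution before transfer learning on AlexNet.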