K. K. Mohd Shariff, Megat Zuhairy Megat Tajuddin, M. Younis, Megat Syahirul Amin Megat Ali
{"title":"基于连续波雷达和优化AlexNet的静音语音接口","authors":"K. K. Mohd Shariff, Megat Zuhairy Megat Tajuddin, M. Younis, Megat Syahirul Amin Megat Ali","doi":"10.1109/ISWTA55313.2022.9942770","DOIUrl":null,"url":null,"abstract":"Silent speech interface systems enables human-computer interaction via speech, in the absence of an acoustic signal. The applications range from speech recognition in noisy environment to interaction with speech-impaired individuals. This study proposes a state-of-the-art solution to silent speech interface based on continuous-wave radar. Six volunteers have participated in the study. They are required to silently utter four native Malay command words. A total of 1,180 samples have been obtained. The spectrograms of mouth movements from radar echo are constructed as profile of the silent commands. The feature images are used to develop a deep learning model by performing transfer learning on the AlexNet architecture. Different hyperparameter settings are evaluated. The best performance is obtained when the network is trained using Adam optimizer at with batch size of 512. The optimized model attained classification accuracies of 99.8% for training, and 98.3% for validation.","PeriodicalId":293957,"journal":{"name":"2022 IEEE Symposium on Wireless Technology & Applications (ISWTA)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Silent Speech Interface using Continuous-Wave Radar and Optimized AlexNet\",\"authors\":\"K. K. Mohd Shariff, Megat Zuhairy Megat Tajuddin, M. Younis, Megat Syahirul Amin Megat Ali\",\"doi\":\"10.1109/ISWTA55313.2022.9942770\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Silent speech interface systems enables human-computer interaction via speech, in the absence of an acoustic signal. 
The applications range from speech recognition in noisy environment to interaction with speech-impaired individuals. This study proposes a state-of-the-art solution to silent speech interface based on continuous-wave radar. Six volunteers have participated in the study. They are required to silently utter four native Malay command words. A total of 1,180 samples have been obtained. The spectrograms of mouth movements from radar echo are constructed as profile of the silent commands. The feature images are used to develop a deep learning model by performing transfer learning on the AlexNet architecture. Different hyperparameter settings are evaluated. The best performance is obtained when the network is trained using Adam optimizer at with batch size of 512. The optimized model attained classification accuracies of 99.8% for training, and 98.3% for validation.\",\"PeriodicalId\":293957,\"journal\":{\"name\":\"2022 IEEE Symposium on Wireless Technology & Applications (ISWTA)\",\"volume\":\"7 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-08-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE Symposium on Wireless Technology & Applications (ISWTA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ISWTA55313.2022.9942770\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE Symposium on Wireless Technology & Applications 
(ISWTA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISWTA55313.2022.9942770","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Silent Speech Interface using Continuous-Wave Radar and Optimized AlexNet
Silent speech interface systems enable human-computer interaction via speech in the absence of an acoustic signal. Applications range from speech recognition in noisy environments to interaction with speech-impaired individuals. This study proposes a continuous-wave radar-based solution to the silent speech interface. Six volunteers participated in the study, each silently uttering four native Malay command words, yielding a total of 1,180 samples. Spectrograms of mouth movements, derived from the radar echo, serve as profiles of the silent commands. These feature images are used to develop a deep learning model by performing transfer learning on the AlexNet architecture. Different hyperparameter settings are evaluated; the best performance is obtained when the network is trained with the Adam optimizer and a batch size of 512. The optimized model attained classification accuracies of 99.8% for training and 98.3% for validation.
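The feature-extraction step described in the abstract, building a spectrogram of mouth movement from the continuous-wave radar echo, can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the sample rate, carrier wavelength, lip-motion model, and STFT window/hop sizes are all assumed values chosen only to make the sketch self-contained.

```python
import numpy as np

# Simulated CW radar echo: mouth motion phase-modulates the return signal
# (micro-Doppler). All parameters below are illustrative assumptions.
fs = 2000                                        # sample rate in Hz (assumed)
t = np.arange(0, 1.0, 1 / fs)
mouth_disp = 0.004 * np.sin(2 * np.pi * 3 * t)   # ~4 mm lip motion at 3 Hz (assumed)
wavelength = 0.0125                              # e.g. a 24 GHz radar (assumed)
echo = np.cos(4 * np.pi * mouth_disp / wavelength)

# Short-time Fourier transform -> spectrogram, i.e. the "feature image"
# that would be fed to the AlexNet-based classifier.
win, hop = 256, 64
frames = [echo[i:i + win] * np.hanning(win)
          for i in range(0, len(echo) - win + 1, hop)]
spec = np.abs(np.fft.rfft(np.stack(frames), axis=1)) ** 2  # power spectrogram

print(spec.shape)  # (time frames, frequency bins) -> (28, 129)
```

In a pipeline like the one described, each such time-frequency image would be rendered at the network's input resolution before transfer learning on AlexNet.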