Nadeen Ahmed, Jowaria Khan, Nouran Sheta, Rahma Tarek, I. Zualkernan, F. Aloul
2022 IEEE 7th Forum on Research and Technologies for Society and Industry Innovation (RTSI), published 2022-08-24. DOI: 10.1109/RTSI55261.2022.9905158
Detecting Replay Attack on Voice-Controlled Systems using Small Neural Networks
Voice control is becoming a common interface for many consumer IoT systems. Common threats to such systems include impersonation, replay, speech synthesis, and voice conversion attacks. Of these, replay is the easiest to implement: a command is recorded and then played back. This paper explores the development of a lightweight intrusion detection neural network based on a recent voice command replay dataset. A lightweight model based on 1D Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM) was proposed. The trained model was compared with baseline models based on Gaussian Mixture Models (GMM) using Constant Q Cepstral Coefficients (CQCC) and Mel-Frequency Cepstral Coefficients (MFCC). The proposed model outperformed the GMM baselines and was significantly smaller, making it more feasible to implement on embedded systems.
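The abstract's point about model size can be illustrated with simple parameter-count arithmetic. The sketch below is only that: every layer size, feature dimension, and mixture count in it is a hypothetical placeholder, not a value from the paper, and the helper names are invented for illustration. It assumes the standard parameter formulas for a 1D convolution, an LSTM layer, a dense layer, and a diagonal-covariance GMM, with 32-bit floats for storage.

```python
# Illustrative parameter-count arithmetic. All sizes below are hypothetical
# and are NOT taken from the paper.


def conv1d_params(in_channels: int, filters: int, kernel_size: int) -> int:
    """Kernel weights plus one bias per filter for a 1D convolution."""
    return kernel_size * in_channels * filters + filters


def lstm_params(input_dim: int, units: int) -> int:
    """Four gates, each with input weights, recurrent weights, and a bias."""
    return 4 * (units * (input_dim + units) + units)


def dense_params(in_features: int, out_features: int) -> int:
    """Weight matrix plus one bias per output."""
    return in_features * out_features + out_features


def diag_gmm_params(components: int, feature_dim: int) -> int:
    """Per component: a mean and a variance per dimension, plus one weight."""
    return components * (2 * feature_dim + 1)


def size_kib(n_params: int, bytes_per_param: int = 4) -> float:
    """Storage in KiB, assuming 32-bit floats unless stated otherwise."""
    return n_params * bytes_per_param / 1024


# Hypothetical CNN-LSTM: 40 features per frame -> Conv1D(16 filters, kernel 3)
# -> LSTM(32 units) -> Dense(1) for a genuine-versus-replayed decision.
cnn_lstm_total = (
    conv1d_params(in_channels=40, filters=16, kernel_size=3)
    + lstm_params(input_dim=16, units=32)
    + dense_params(in_features=32, out_features=1)
)

# Hypothetical GMM back end: one 512-component diagonal mixture per class
# (genuine and replayed) over the same 40-dimensional features.
gmm_total = 2 * diag_gmm_params(components=512, feature_dim=40)

print(f"CNN-LSTM: {cnn_lstm_total} parameters, {size_kib(cnn_lstm_total):.1f} KiB")
print(f"GMM pair: {gmm_total} parameters, {size_kib(gmm_total):.1f} KiB")
```

The result depends entirely on the assumed sizes, so it does not reproduce the paper's measurements. It only shows how such a size comparison could be set up.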