Fingerprinting encrypted voice traffic on smart speakers with deep learning

Chenggang Wang, Sean Kennedy, Haipeng Li, King Hudson, G. Atluri, Xuetao Wei, Wenhai Sun, Boyang Wang
DOI: 10.1145/3395351.3399357
Published in: Proceedings of the 13th ACM Conference on Security and Privacy in Wireless and Mobile Networks (WiSec '20), May 20, 2020
Citations: 32

Abstract

This paper investigates the privacy leakage of smart speakers under an encrypted traffic analysis attack referred to as voice command fingerprinting. In this attack, an adversary eavesdrops on both the outgoing and incoming encrypted voice traffic of a smart speaker and infers which voice command a user spoke from the encrypted traffic alone. We first built an automatic voice traffic collection tool and collected two large-scale datasets on two smart speakers, Amazon Echo and Google Home. We then implemented proof-of-concept attacks leveraging deep learning. Our experimental results over the two datasets reveal disturbing privacy concerns. Specifically, compared to 1% accuracy with random guessing, our attacks correctly infer voice commands over encrypted traffic with 92.89% accuracy on Amazon Echo. Despite the variance that human voices may introduce in outgoing traffic, our proof-of-concept attacks remain effective even when leveraging only incoming traffic (i.e., the traffic from the server). This is because the AI-based voice services running on the server side respond to commands in the same voice and in a deterministic or predictable textual manner, which leaves distinguishable patterns in the encrypted traffic. We also built a proof-of-concept defense that obfuscates encrypted traffic. Our results show that the defense can effectively reduce attack accuracy on Amazon Echo to 32.18%.
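The attack pipeline the abstract describes (capture encrypted traffic, infer the spoken command from traffic patterns) can be sketched roughly as follows. This is a minimal illustration, not the paper's method: the feature representation (signed packet sizes padded to a fixed length, a common choice in traffic-fingerprinting work), the example traces, and the nearest-neighbour stand-in for the deep learning classifier are all assumptions, since the abstract does not specify the features or model architecture.

```python
import numpy as np

def trace_to_features(trace, max_len=256):
    """Convert a packet trace into a fixed-length vector of signed sizes.

    Each packet is (size_bytes, direction), with direction +1 for outgoing
    and -1 for incoming. Encryption hides payloads, but packet sizes and
    directions survive and leak the "shape" of each command's traffic.
    """
    signed = [size * direction for size, direction in trace]
    signed = signed[:max_len]
    signed += [0] * (max_len - len(signed))  # zero-pad short traces
    return np.array(signed, dtype=np.float32)

# Hypothetical traces: two different voice commands yield different
# incoming (server-to-speaker) packet-size patterns.
trace_weather = [(120, 1), (1400, -1), (1400, -1), (900, -1)]
trace_alarm = [(130, 1), (700, -1), (300, -1)]

x_weather = trace_to_features(trace_weather)
x_alarm = trace_to_features(trace_alarm)

# A deep model (e.g., a 1D CNN over these vectors) would be trained on
# many labeled traces; here a nearest-neighbour comparison stands in.
unknown = trace_to_features([(118, 1), (1400, -1), (1400, -1), (905, -1)])
label = ("weather"
         if np.linalg.norm(unknown - x_weather) < np.linalg.norm(unknown - x_alarm)
         else "alarm")
```

Even with per-trace variance (slightly different sizes on each capture), the overall size/direction pattern is stable enough to classify, which is why the abstract's incoming-only attack still works.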
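The proof-of-concept defense obfuscates the encrypted traffic so these patterns are less distinguishable. The abstract does not describe the mechanism, so the sketch below shows one common obfuscation technique as an assumption: padding every packet up to the next fixed-size bucket, which collapses distinct size sequences into identical ones at the cost of bandwidth overhead.

```python
def pad_to_bucket(size, bucket=512):
    """Pad a packet length up to the next multiple of `bucket` bytes,
    trading bandwidth overhead for less distinguishable traffic."""
    return ((size + bucket - 1) // bucket) * bucket

# Two traces that a classifier could tell apart by exact sizes...
trace_a = [120, 1400, 900]
trace_b = [130, 1380, 870]

padded_a = [pad_to_bucket(s) for s in trace_a]
padded_b = [pad_to_bucket(s) for s in trace_b]
# ...become identical after bucket padding, removing the size signal.
```

Padding alone leaves timing and packet counts observable, which is consistent with the defense reducing, rather than eliminating, attack accuracy (32.18% versus the 1% random-guess baseline).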