迷失在翻译中:基于ai的跨语言蹲音生成器

2023 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW) Pub Date : 2023-07-01 DOI:10.1109/EuroSPW59978.2023.00063

R. Valentim, I. Drago, M. Mellia, Federico Cerutti

{"title":"迷失在翻译中:基于ai的跨语言蹲音生成器","authors":"R. Valentim, I. Drago, M. Mellia, Federico Cerutti","doi":"10.1109/EuroSPW59978.2023.00063","DOIUrl":null,"url":null,"abstract":"Sound-squatting is a phishing attack that tricks users into accessing malicious resources by exploiting similarities in the pronunciation of words. It is an understudied threat that gains traction with the popularity of smart-speakers and the resurgence of content consumption exclusively via audio, such as podcasts. Defending against sound-squatting is complex, and existing solutions rely on manually curated lists of homophones, which limits the search to a few (and mostly existing) words only. We introduce Sound-squatter, a multi-language AI-based system that generates sound-squatting candidates for proactive defense that covers over 80% of exact homophones and further generating thousands of high-quality approximated homophones. Sound-squatter relies on a state-of-art Transformer Network to learn transliteration. We search for Sound-squatter generated cross-language sound-squatting domains over hundreds of millions of emitted TLS certificates comparing with other types of squatting candidates. Our finding reveals that around 6% of generated sound-squatting candidates have emitted TLS certificates, compared to 8% of other types of squatting candidates. We believe Sound-squatter uncovers the usage of multilingual sound-squatting phenomenon on the Internet and it is a crucial asset for proactive protection against sound-squatting.","PeriodicalId":220415,"journal":{"name":"2023 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW)","volume":"66 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Lost in Translation: AI-based Generator of Cross-Language Sound-squatting\",\"authors\":\"R. Valentim, I. Drago, M. Mellia, Federico Cerutti\",\"doi\":\"10.1109/EuroSPW59978.2023.00063\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Sound-squatting is a phishing attack that tricks users into accessing malicious resources by exploiting similarities in the pronunciation of words. It is an understudied threat that gains traction with the popularity of smart-speakers and the resurgence of content consumption exclusively via audio, such as podcasts. Defending against sound-squatting is complex, and existing solutions rely on manually curated lists of homophones, which limits the search to a few (and mostly existing) words only. We introduce Sound-squatter, a multi-language AI-based system that generates sound-squatting candidates for proactive defense that covers over 80% of exact homophones and further generating thousands of high-quality approximated homophones. Sound-squatter relies on a state-of-art Transformer Network to learn transliteration. We search for Sound-squatter generated cross-language sound-squatting domains over hundreds of millions of emitted TLS certificates comparing with other types of squatting candidates. Our finding reveals that around 6% of generated sound-squatting candidates have emitted TLS certificates, compared to 8% of other types of squatting candidates. We believe Sound-squatter uncovers the usage of multilingual sound-squatting phenomenon on the Internet and it is a crucial asset for proactive protection against sound-squatting.\",\"PeriodicalId\":220415,\"journal\":{\"name\":\"2023 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW)\",\"volume\":\"66 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2023 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/EuroSPW59978.2023.00063\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/EuroSPW59978.2023.00063","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

Sound-squatting是一种网络钓鱼攻击，通过利用单词发音的相似性来诱骗用户访问恶意资源。随着智能音箱的普及和专门通过音频(如播客)进行的内容消费的复苏，这是一个尚未得到充分研究的威胁。防止“抢音”是很复杂的，现有的解决方案依赖于人工管理的同音异义词列表，这将搜索限制在少数(而且大多数是现有的)单词上。我们介绍了一种基于多语言人工智能的系统，该系统可以生成用于主动防御的候选音蹲，覆盖80%以上的精确同音异义字，并进一步生成数千个高质量的近似同音异义字。声音霸占者依靠最先进的变压器网络来学习音译。我们在数以亿计的发出的TLS证书中，与其他类型的抢注候选证书进行比较，搜索由抢注生成的跨语言抢注域名。我们的研究结果显示，大约6%的生成的语音抢注候选对象发出了TLS证书，而其他类型的抢注候选对象的这一比例为8%。我们认为，“拼音”揭示了互联网上多语言拼音现象的使用，这是主动防止拼音的重要资产。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Lost in Translation: AI-based Generator of Cross-Language Sound-squatting

Sound-squatting is a phishing attack that tricks users into accessing malicious resources by exploiting similarities in the pronunciation of words. It is an understudied threat that gains traction with the popularity of smart-speakers and the resurgence of content consumption exclusively via audio, such as podcasts. Defending against sound-squatting is complex, and existing solutions rely on manually curated lists of homophones, which limits the search to a few (and mostly existing) words only. We introduce Sound-squatter, a multi-language AI-based system that generates sound-squatting candidates for proactive defense that covers over 80% of exact homophones and further generating thousands of high-quality approximated homophones. Sound-squatter relies on a state-of-art Transformer Network to learn transliteration. We search for Sound-squatter generated cross-language sound-squatting domains over hundreds of millions of emitted TLS certificates comparing with other types of squatting candidates. Our finding reveals that around 6% of generated sound-squatting candidates have emitted TLS certificates, compared to 8% of other types of squatting candidates. We believe Sound-squatter uncovers the usage of multilingual sound-squatting phenomenon on the Internet and it is a crucial asset for proactive protection against sound-squatting.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2023 IEEE European Symposium on Security and Privacy Workshops (EuroS&PW)

自引率

0.00%

发文量