X-squatter: AI Multilingual Generation of Cross-Language Sound-squatting

IF 2.8 4区计算机科学 Q2 COMPUTER SCIENCE, INFORMATION SYSTEMS

ACM Transactions on Privacy and Security Pub Date : 2024-05-06 DOI:10.1145/3663569

Rodolfo Vieira Valentim, Idilio Drago, Marco Mellia, Federico Cerutti

{"title":"X-squatter: AI Multilingual Generation of Cross-Language Sound-squatting","authors":"Rodolfo Vieira Valentim, Idilio Drago, Marco Mellia, Federico Cerutti","doi":"10.1145/3663569","DOIUrl":null,"url":null,"abstract":"<p>Sound-squatting is a squatting technique that exploits similarities in word pronunciation to trick users into accessing malicious resources. It is an understudied threat that has gained traction with the popularity of smart speakers and audio-only content, such as podcasts. The picture gets even more complex when multiple languages are involved. We here introduce X-squatter, a multi- and cross-language AI-based system that relies on a Transformer Neural Network for generating high-quality sound-squatting candidates. We illustrate the use of X-squatter by searching for domain name squatting abuse across hundreds of millions of issued TLS certificates, alongside other squatting types. Key findings unveil that approximately 15% of generated sound-squatting candidates have associated TLS certificates, well above the prevalence of other squatting types (7%). Furthermore, we employ X-squatter to assess the potential for abuse in PyPI packages, revealing the existence of hundreds of candidates within a three-year package history. Notably, our results suggest that the current platform checks cannot handle sound-squatting attacks, calling for better countermeasures. We believe X-squatter uncovers the usage of multilingual sound-squatting phenomenon on the Internet and it is a crucial asset for proactive protection against the threat.</p>","PeriodicalId":56050,"journal":{"name":"ACM Transactions on Privacy and Security","volume":"47 1","pages":""},"PeriodicalIF":2.8000,"publicationDate":"2024-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Privacy and Security","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/3663569","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}

引用次数: 0

Abstract

Sound-squatting is a squatting technique that exploits similarities in word pronunciation to trick users into accessing malicious resources. It is an understudied threat that has gained traction with the popularity of smart speakers and audio-only content, such as podcasts. The picture gets even more complex when multiple languages are involved. We here introduce X-squatter, a multi- and cross-language AI-based system that relies on a Transformer Neural Network for generating high-quality sound-squatting candidates. We illustrate the use of X-squatter by searching for domain name squatting abuse across hundreds of millions of issued TLS certificates, alongside other squatting types. Key findings unveil that approximately 15% of generated sound-squatting candidates have associated TLS certificates, well above the prevalence of other squatting types (7%). Furthermore, we employ X-squatter to assess the potential for abuse in PyPI packages, revealing the existence of hundreds of candidates within a three-year package history. Notably, our results suggest that the current platform checks cannot handle sound-squatting attacks, calling for better countermeasures. We believe X-squatter uncovers the usage of multilingual sound-squatting phenomenon on the Internet and it is a crucial asset for proactive protection against the threat.

查看原文本刊更多论文

X-squatter：人工智能多语种生成跨语言 "呷呷 "声

声音蹲守是一种利用单词发音相似性诱骗用户访问恶意资源的蹲守技术。这是一种未被充分研究的威胁，随着智能扬声器和纯音频内容（如播客）的普及，这种威胁的影响力越来越大。如果涉及多种语言，情况就会变得更加复杂。我们在此介绍 X-squatter，它是一种基于多语言和跨语言的人工智能系统，依靠变形神经网络生成高质量的声音侵扰候选者。我们通过在数以亿计的已签发 TLS 证书中搜索域名抢注滥用以及其他抢注类型来说明 X-squatter 的用途。主要研究结果表明，在生成的恶意抢注候选域名中，约有 15%具有相关的 TLS 证书，远高于其他抢注类型的发生率（7%）。此外，我们还利用 X-squatter 评估了 PyPI 软件包的滥用潜力，发现在三年的软件包历史中存在数百个候选软件。值得注意的是，我们的结果表明，当前的平台检查无法处理声音剽窃攻击，因此需要更好的应对措施。我们认为，X-squatter 发现了互联网上多语言声音抢注现象的使用情况，是主动防范这种威胁的重要资产。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

ACM Transactions on Privacy and Security Computer Science-General Computer Science

CiteScore

5.20

自引率

0.00%

发文量

期刊介绍： ACM Transactions on Privacy and Security (TOPS) (formerly known as TISSEC) publishes high-quality research results in the fields of information and system security and privacy. Studies addressing all aspects of these fields are welcomed, ranging from technologies, to systems and applications, to the crafting of policies.