Boosting Homograph Attack Classification Using Ensemble Learning and N-gram Model

Tran Thao Phuong, Hoang-Quoc Nguyen-Son, R. Yamaguchi, Toshiyuki Nakata
2020 IEEE 19th International Conference on Trust, Security and Privacy in Computing and Communications (TrustCom), 2020-12-01
DOI: 10.1109/TrustCom50675.2020.00271 (https://doi.org/10.1109/TrustCom50675.2020.00271)

Abstract

In a visual homograph attack, an attacker deceives web users about which domain they are visiting by using forged domains that look similar to genuine ones. T. Thao et al. (IFIP SEC'19) proposed a homograph classification method that applies conventional supervised learning algorithms to features extracted from a single-character-based Structural Similarity Index (SSIM). This paper aims to improve classification accuracy by combining their SSIM features with 199 features extracted from an N-gram model and applying advanced ensemble learning algorithms. The experimental results show that the proposed method improves accuracy by as much as 1.81% and reduces the false-positive rate by 2.15%. Furthermore, existing work applied machine learning to a set of features without explaining why doing so improves accuracy. Improving accuracy matters, but understanding the reasons behind the improvement is also crucial. Therefore, we conducted an empirical error analysis and obtained several findings that help explain the proposed approach.
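The N-gram side of the feature set can be illustrated with a minimal sketch. The abstract does not say how the 199 N-gram features are defined, so everything below is an assumption made for illustration: character bigrams, a vocabulary of the most frequent grams, raw counts as feature values, and the hypothetical helper names `char_ngrams`, `build_vocabulary`, and `ngram_features`.

```python
from collections import Counter


def char_ngrams(domain: str, n: int) -> list[str]:
    """Return all contiguous character n-grams of a domain string."""
    return [domain[i:i + n] for i in range(len(domain) - n + 1)]


def build_vocabulary(domains: list[str], n: int, max_features: int) -> list[str]:
    """Pick the most frequent n-grams across the given domains.

    Assumption: the paper's actual feature selection is not described in
    the abstract; frequency ranking is used here only as a placeholder.
    """
    counts = Counter()
    for d in domains:
        counts.update(char_ngrams(d, n))
    # Sort by descending count, then alphabetically, for a deterministic order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gram for gram, _ in ranked[:max_features]]


def ngram_features(domain: str, vocabulary: list[str], n: int) -> list[int]:
    """Map a domain to a fixed-length vector of n-gram counts."""
    counts = Counter(char_ngrams(domain, n))
    return [counts[g] for g in vocabulary]


# Toy data: "goog1e" imitates "google" by swapping "l" for "1".
train = ["google", "goog1e", "paypal"]
vocab = build_vocabulary(train, n=2, max_features=5)
features = ngram_features("google", vocab, n=2)
```

In a pipeline like the one the abstract describes, a vector of this kind would presumably be concatenated with the SSIM features before being passed to an ensemble classifier; that step is omitted here because the abstract gives no details about it.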