基于加权特征线嵌入的钓鱼网站检测

M. Imani, G. Montazer
{"title":"基于加权特征线嵌入的钓鱼网站检测","authors":"M. Imani, G. Montazer","doi":"10.22042/ISECURE.2017.83439.377","DOIUrl":null,"url":null,"abstract":"The aim of phishing is tracing the users’ s private information without their permission by designing a new website which mimics the trusted website. The specialists of information technology do not agree on a unique definition for the discriminative features that characterizes the phishing websites. Therefore, the number of reliable training samples in phishing detection problems is limited. Moreover, among the available training samples, there are abnormal samples that cause classification error. For instance, it is possible that there are phishing samples with similar features to legitimate ones and vice versa. A supervised feature extraction method, called weighted feature line embedding, is proposed in this paper to solve these problems. The proposed method virtually generates training samples by utilizing the feature line metric. Hence, it can solve the small sample size problem. Moreover, by assigning appropriate weights to each pair of feature points, it corrects the undesirable quality of abnormal samples. The features extracted by our method improve the performance of phishing website detection specially by using small training","PeriodicalId":436674,"journal":{"name":"ISC Int. J. Inf. Secur.","volume":"11 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"Phishing website detection using weighted feature line embedding\",\"authors\":\"M. Imani, G. Montazer\",\"doi\":\"10.22042/ISECURE.2017.83439.377\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The aim of phishing is tracing the users’ s private information without their permission by designing a new website which mimics the trusted website. The specialists of information technology do not agree on a unique definition for the discriminative features that characterizes the phishing websites. Therefore, the number of reliable training samples in phishing detection problems is limited. Moreover, among the available training samples, there are abnormal samples that cause classification error. For instance, it is possible that there are phishing samples with similar features to legitimate ones and vice versa. A supervised feature extraction method, called weighted feature line embedding, is proposed in this paper to solve these problems. The proposed method virtually generates training samples by utilizing the feature line metric. Hence, it can solve the small sample size problem. Moreover, by assigning appropriate weights to each pair of feature points, it corrects the undesirable quality of abnormal samples. The features extracted by our method improve the performance of phishing website detection specially by using small training\",\"PeriodicalId\":436674,\"journal\":{\"name\":\"ISC Int. J. Inf. Secur.\",\"volume\":\"11 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-07-31\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ISC Int. J. Inf. Secur.\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.22042/ISECURE.2017.83439.377\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ISC Int. J. Inf. Secur.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.22042/ISECURE.2017.83439.377","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

摘要

网络钓鱼的目的是在未经用户允许的情况下,通过设计一个模仿可信网站的新网站来追踪用户的私人信息。信息技术专家对网络钓鱼网站的鉴别特征没有统一的定义。因此,网络钓鱼检测问题中可靠的训练样本数量是有限的。此外,在可用的训练样本中,存在导致分类误差的异常样本。例如,有可能存在与合法样本具有相似特征的网络钓鱼样本,反之亦然。为了解决这些问题,本文提出了一种有监督的特征提取方法——加权特征线嵌入。该方法利用特征线度量虚拟生成训练样本。因此,它可以解决小样本量问题。此外,通过对每对特征点分配适当的权重,它纠正了异常样本的不良质量。该方法提取的特征通过小训练提高了网络钓鱼网站的检测性能
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Phishing website detection using weighted feature line embedding
The aim of phishing is tracing the users’ s private information without their permission by designing a new website which mimics the trusted website. The specialists of information technology do not agree on a unique definition for the discriminative features that characterizes the phishing websites. Therefore, the number of reliable training samples in phishing detection problems is limited. Moreover, among the available training samples, there are abnormal samples that cause classification error. For instance, it is possible that there are phishing samples with similar features to legitimate ones and vice versa. A supervised feature extraction method, called weighted feature line embedding, is proposed in this paper to solve these problems. The proposed method virtually generates training samples by utilizing the feature line metric. Hence, it can solve the small sample size problem. Moreover, by assigning appropriate weights to each pair of feature points, it corrects the undesirable quality of abnormal samples. The features extracted by our method improve the performance of phishing website detection specially by using small training
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信