A User Behavior Anomaly Detection Approach Based on Sequence Mining over Data Streams

Yong Zhou, Yijie Wang, Xingkong Ma
Published in: 2016 17th International Conference on Parallel and Distributed Computing, Applications and Technologies (PDCAT)
Publication date: 2016-12-01
DOI: 10.1109/PDCAT.2016.086 (https://doi.org/10.1109/PDCAT.2016.086)
Citations: 3

Abstract

Designing a low-latency, accurate approach for detecting user behavior anomalies over data streams has become a major challenge. Existing studies cannot meet both the latency and the accuracy requirements, because behaviors contain a large number of subsequences and sequential relationships. This paper presents BADSM, a user behavior anomaly detection approach based on sequence mining over data streams that seeks to address this challenge. BADSM uses a self-adaptive behavior pruning algorithm to adaptively divide the data stream into behaviors and to reduce the number of subsequences, which improves the efficiency of sequence mining. In addition, a top-k abnormal scoring algorithm reduces the complexity of traversal and produces a quantitative detection result, which improves accuracy. We design and implement a streaming anomaly detection system based on BADSM to perform online detection. Extensive experiments confirm that, compared with the classic sequence mining approach PrefixSpan, BADSM reduces processing delay by at least 36.8% and the false positive rate by 6.4%.
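For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of PrefixSpan-style sequential pattern mining, restricted to sequences of single items. It is not the BADSM implementation, whose details the abstract does not give. The function `top_k_anomaly_score` and its scoring rule (the share of the k most frequent patterns that a behavior fails to contain) are hypothetical and only illustrate how a quantitative score could be derived from mined patterns; the example event names are likewise invented.

```python
from collections import defaultdict
import heapq


def prefixspan(sequences, min_support):
    """Mine frequent sequential patterns from sequences of single items.

    Returns a dict mapping each pattern (a tuple of items) to its support,
    i.e. the number of input sequences containing it as a subsequence.
    """
    results = {}

    def mine(prefix, projected):
        # projected holds (sequence index, start offset) pairs: the suffix of
        # each sequence that follows the first occurrence of the prefix.
        occurrences = defaultdict(list)
        for seq_idx, start in projected:
            seen = set()
            seq = sequences[seq_idx]
            for pos in range(start, len(seq)):
                item = seq[pos]
                if item not in seen:  # count each sequence once per item
                    seen.add(item)
                    occurrences[item].append((seq_idx, pos + 1))
        for item, next_projected in occurrences.items():
            if len(next_projected) >= min_support:
                pattern = prefix + (item,)
                results[pattern] = len(next_projected)
                mine(pattern, next_projected)  # grow the pattern recursively

    mine((), [(i, 0) for i in range(len(sequences))])
    return results


def is_subsequence(pattern, sequence):
    """Return True if the items of pattern appear in sequence in order."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def top_k_anomaly_score(behavior, patterns, k):
    """Hypothetical score, not taken from the paper: the share of the k most
    frequent patterns that the behavior does not contain (0.0 to 1.0)."""
    top = heapq.nlargest(k, patterns.items(), key=lambda kv: kv[1])
    if not top:
        return 0.0
    missing = sum(1 for pattern, _ in top if not is_subsequence(pattern, behavior))
    return missing / len(top)


# Invented example behaviors, each a sequence of user events.
history = [
    ["login", "browse", "buy"],
    ["login", "browse", "logout"],
    ["login", "buy", "logout"],
]
patterns = prefixspan(history, min_support=2)
print(top_k_anomaly_score(["delete", "export"], patterns, k=1))
```

The recursion grows one pattern at a time and only scans the projected suffixes, which is the property that makes PrefixSpan cheaper than enumerating all candidate subsequences. The abstract's point is that even this becomes costly on streams with many subsequences, which is what the pruning step in BADSM is meant to address.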