Web Behavior Analysis Using Sparse Non-Negative Matrix Factorization

Akihiro Demachi, Shin Matsushima, K. Yamanishi
Published in: 2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA), 2016-10-01
DOI: 10.1109/DSAA.2016.85 (https://doi.org/10.1109/DSAA.2016.85)
Citations: 3

Abstract

We are concerned with discovering behavioral patterns on the web. Given a large volume of web access logs, we are interested in how they can be categorized and how they relate to real-life activities. To conduct this analysis, we develop a novel algorithm for sparse non-negative matrix factorization (SNMF) that can discover patterns of web behavior. Although a number of SNMF variants exist, our algorithm is novel in that it updates parameters multiplicatively with a performance guarantee, and therefore works more robustly than existing ones, even when the rank of the factorized matrices is large. We demonstrate the effectiveness of our algorithm using artificial data sets. We then apply it to large-scale web log data obtained from 70,000 monitors to discover meaningful relations between web behavioral patterns and real-life activities. Using an information-theoretic measure, we show that our algorithm extracts more significant relations between web behavior patterns and real-life activities than competing methods.
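To make the idea of multiplicative updates for sparse NMF concrete, the following is a minimal sketch and not the authors' algorithm, whose update rule and guarantee are not given in the abstract. It assumes a squared Frobenius reconstruction loss with an L1 penalty on the coefficient matrix, and uses the generic Lee-Seung-style multiplicative rule with the penalty added to the denominator. The function name `sparse_nmf` and the parameters `lam`, `n_iter`, and `eps` are hypothetical, chosen only for illustration.

```python
import numpy as np

def sparse_nmf(X, rank, lam=0.1, n_iter=200, eps=1e-10, seed=0):
    """Approximate a non-negative matrix X (n x m) as W @ H.

    Illustrative only: generic L1-penalized multiplicative updates,
    NOT the update rule proposed in the paper.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    # Strictly positive initialization keeps every later update well defined.
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # The L1 penalty on H appears as the constant lam in the denominator,
        # which shrinks small entries of H toward zero.
        H *= (W.T @ X) / (W.T @ W @ H + lam + eps)
        # Plain (unpenalized) multiplicative update for W.
        W *= (X @ H.T) / (W @ (H @ H.T) + eps)
        # Rescale columns of W to unit norm and compensate in H, so the
        # penalty on H cannot be evaded by inflating W. W @ H is unchanged.
        norms = np.linalg.norm(W, axis=0) + eps
        W /= norms
        H *= norms[:, None]
    return W, H

if __name__ == "__main__":
    # Hypothetical toy data standing in for a (user x page) access-count matrix.
    rng = np.random.default_rng(1)
    X = rng.random((20, 3)) @ rng.random((3, 30))
    W, H = sparse_nmf(X, rank=3, lam=0.01)
    print("relative error:", np.linalg.norm(X - W @ H) / np.linalg.norm(X))
```

Because every factor in the update is non-negative, W and H stay non-negative without any projection step, which is the main appeal of multiplicative rules. This sketch makes no convergence claim of the kind the abstract attributes to the proposed method.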