Matrix dimensionality reduction for mining Web logs

Jianjiang Lu, Baowen Xu, Hongji Yang
Published in: Proceedings IEEE/WIC International Conference on Web Intelligence (WI 2003), 2003-10-13
DOI: 10.1109/WI.2003.1241222 (https://doi.org/10.1109/WI.2003.1241222)
Citations: 17

Abstract

Web logs contain potentially useful data with which designers can assess the usability and effectiveness of their design choices. Clustering techniques have recently been used to discover typical user profiles automatically from Web access logs. However, designing an effective similarity measure between session vectors, which are usually high-dimensional and sparse, is a challenging problem. We apply nonnegative matrix factorisation to reduce the dimensionality of the session-URL matrix, and use the spherical k-means algorithm to partition the projected user session vectors into several clusters. Finally, we present two methods for discovering typical user session profiles from the clusters. Experimental results show that our algorithms can mine interesting user profiles effectively.
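
The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which factorisation variant is used, so the code assumes the standard multiplicative updates for the squared-error objective, and it assumes that the rows of the factor `W` serve as the projected session vectors. The toy session-URL matrix `X`, the function names `nmf` and `spherical_kmeans`, and all parameter values are hypothetical.

```python
import numpy as np


def nmf(X, rank, n_iter=200, seed=0, eps=1e-9):
    """Approximate nonnegative X (sessions x URLs) as W @ H.

    Assumption: multiplicative updates for the Frobenius-norm objective;
    the paper may use a different NMF variant.
    """
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so W and H stay nonnegative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


def spherical_kmeans(V, k, n_iter=50, seed=0):
    """Cluster the rows of V by cosine similarity to unit-length centroids."""
    rng = np.random.default_rng(seed)
    norms = np.linalg.norm(V, axis=1, keepdims=True)
    U = V / np.maximum(norms, 1e-12)  # project every row onto the unit sphere
    # Fancy indexing returns a copy, so updating centroids leaves U unchanged.
    centroids = U[rng.choice(len(U), size=k, replace=False)]
    for _ in range(n_iter):
        labels = np.argmax(U @ centroids.T, axis=1)  # most similar centroid
        for j in range(k):
            members = U[labels == j]
            if len(members):  # an empty cluster keeps its previous centroid
                c = members.sum(axis=0)
                centroids[j] = c / max(np.linalg.norm(c), 1e-12)
    labels = np.argmax(U @ centroids.T, axis=1)
    return labels, centroids


# Hypothetical toy data: 6 sessions over 6 URLs, 1 = URL visited in the session.
X = np.array([
    [1, 1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1, 1],
], dtype=float)

W, H = nmf(X, rank=2)                        # rows of W: reduced session vectors
labels, centroids = spherical_kmeans(W, k=2)
print(labels)
```

Clustering the low-dimensional rows of `W` rather than the sparse rows of `X` is the point of the reduction step: cosine similarity between short dense vectors tends to be more informative than between long sparse ones. How cluster profiles are then derived is left out here, since the abstract does not describe the two methods.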