Matrix dimensionality reduction for mining Web logs
Authors: Jianjiang Lu, Baowen Xu, Hongji Yang
Published in: Proceedings IEEE/WIC International Conference on Web Intelligence (WI 2003), 2003-10-13
DOI: https://doi.org/10.1109/WI.2003.1241222
Web-based logs contain potentially useful data with which designers can assess the usability and effectiveness of their design choices. Clustering techniques have recently been used to discover typical user profiles automatically from Web access logs. However, designing an effective similarity measure between session vectors is challenging, because these vectors are usually high-dimensional and sparse. Nonnegative matrix factorisation is applied to reduce the dimensionality of the session-URL matrix, and the spherical k-means algorithm is used to partition the projected user session vectors into several clusters. Finally, two methods for discovering typical user session profiles from the clusters are presented. Experimental results show that our algorithms can mine interesting user profiles effectively.
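The pipeline described in the abstract (nonnegative matrix factorisation of the session-URL matrix, followed by spherical k-means on the reduced session vectors) can be sketched roughly as below. This is a minimal illustration, not the authors' implementation: the abstract does not say which NMF variant or initialisation was used, so the multiplicative-update rule, the farthest-first seeding, the use of the rows of `W` as the projected session vectors, the function names `nmf` and `spherical_kmeans`, and the toy matrix `V` are all assumptions.

```python
import numpy as np

def nmf(V, r, iters=500, seed=0, eps=1e-9):
    """Approximate a nonnegative matrix V (sessions x URLs) as W @ H.

    Uses multiplicative updates for the squared-error objective
    (an assumed variant; the abstract does not name one).
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, r)) + eps  # sessions x r, strictly positive start
    H = rng.random((r, m)) + eps  # r x URLs, strictly positive start
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def spherical_kmeans(X, k, iters=50):
    """Cluster the rows of X by cosine similarity.

    Rows are scaled to unit length; each centroid is the normalised
    sum of its members. Seeding is farthest-first by cosine
    (an assumption chosen here to keep the sketch deterministic).
    """
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    U = X / np.maximum(norms, 1e-12)  # guard against all-zero rows
    chosen = [0]
    while len(chosen) < k:
        # Pick the row least similar to every centroid chosen so far.
        best_sim = (U @ U[chosen].T).max(axis=1)
        chosen.append(int(np.argmin(best_sim)))
    centroids = U[chosen].copy()
    labels = np.argmax(U @ centroids.T, axis=1)
    for _ in range(iters):
        for j in range(k):
            total = U[labels == j].sum(axis=0)
            norm = np.linalg.norm(total)
            if norm > 0:  # leave the centroid unchanged if the cluster is empty
                centroids[j] = total / norm
        new_labels = np.argmax(U @ centroids.T, axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, centroids

# Hypothetical toy session-URL matrix: 6 sessions x 6 URLs, 1 = URL visited.
V = np.array([
    [1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
], dtype=float)

W, H = nmf(V, r=2)                          # reduce 6 URL dimensions to 2
labels, centroids = spherical_kmeans(W, k=2)
print("cluster label per session:", labels)
```

Cosine similarity in the reduced space ignores session length, which is one reason spherical k-means is a natural fit for sparse visit vectors. Deriving profiles from the resulting clusters is not shown, since the abstract does not describe the two methods the paper uses.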