Web Site Recommendation Using HTTP Traffic
Ming Jia, Shaozhi Ye, Xing Li, J. Dickerson
Seventh IEEE International Conference on Data Mining (ICDM 2007), published 2007-10-28
DOI: 10.1109/ICDM.2007.44 (https://doi.org/10.1109/ICDM.2007.44)
Collaborative Filtering (CF) is widely used in web recommender systems, but most existing CF applications focus on transactions or page views within a single site. In this paper, we build a prototype recommender system that suggests web sites to users by collecting browsing events at routers, requiring no effort from either users or websites. 100 million HTTP flows, involving 11,327 websites, are converted to user-site ratings using access frequency as the implicit rating metric. With this rating dataset, we evaluate six CF algorithms, including one proposed algorithm based on IP address locality. Our experiments show that recommendation from K nearest neighbors (Runn) performs best, achieving 50% p@10 (precision of the top 10) and 53% p@5 (precision of the top 5). Although this precision is far from ideal, our preliminary results suggest the potential value of such a centralized web site recommender system.
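The pipeline summarized in the abstract (flows to frequency-based ratings, neighbor-based recommendation, precision at k) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how frequencies are normalized or which similarity measure is used, so the raw access count as the rating, cosine similarity, the use of the client IP address as the user identifier, and all function names and sample hosts below are assumptions made for illustration only.

```python
from collections import Counter, defaultdict
from math import sqrt


def flows_to_ratings(flows):
    """Turn (user, site) flow records into implicit ratings.

    Assumption: the rating is the raw access count per (user, site) pair.
    """
    ratings = defaultdict(Counter)
    for user, site in flows:
        ratings[user][site] += 1
    return ratings


def cosine(a, b):
    """Cosine similarity between two sparse rating vectors (Counters)."""
    dot = sum(a[s] * b[s] for s in a.keys() & b.keys())
    if dot == 0:
        return 0.0
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(ratings, user, k=2, n=10):
    """Recommend up to n unseen sites from the k most similar users."""
    target = ratings[user]
    neighbors = sorted(
        ((cosine(target, r), other) for other, r in ratings.items() if other != user),
        reverse=True,
    )[:k]
    scores = defaultdict(float)
    for sim, other in neighbors:
        if sim <= 0:
            continue  # ignore neighbors with nothing in common
        for site, count in ratings[other].items():
            if site not in target:
                scores[site] += sim * count  # similarity-weighted frequency
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda s: (-scores[s], s))[:n]


def precision_at_k(recommended, relevant, k):
    """Fraction of the top-k recommendations that appear in `relevant`."""
    if k <= 0:
        return 0.0
    return sum(1 for s in recommended[:k] if s in relevant) / k


# Hypothetical flow records: (client IP, requested host).
flows = [
    ("10.0.0.1", "news.example"), ("10.0.0.1", "news.example"),
    ("10.0.0.1", "mail.example"),
    ("10.0.0.2", "news.example"), ("10.0.0.2", "video.example"),
    ("10.0.0.2", "mail.example"),
    ("10.0.0.3", "shop.example"),
]
ratings = flows_to_ratings(flows)
recs = recommend(ratings, "10.0.0.1", k=2, n=5)
print(recs)
print(precision_at_k(recs, {"video.example"}, 1))
```

In an evaluation like the one reported, `relevant` would be the set of sites a user actually visits in a held-out portion of the traffic, so p@10 and p@5 correspond to `precision_at_k(recs, relevant, 10)` and `precision_at_k(recs, relevant, 5)`.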