Di Pan, K. Yu, Xiaofei Wu, Binbin Wang, Yaowen Tan
2017 IEEE/CIC International Conference on Communications in China (ICCC), published 2017-10-01. DOI: https://doi.org/10.1109/ICCChina.2017.8330460
E-commerce user behavior classification based on URL information from telecom DPI data
With the rapid development of the mobile Internet, users increasingly shop online through e-commerce apps. It is therefore important to analyze user behaviors such as browsing products, adding items to the cart, searching, and paying. In this paper, we use the visit information in deep packet inspection (DPI) data from Internet service providers (ISPs) and propose an e-commerce user behavior classification method based only on the URL. In addition to N-gram features of the URL, five feature extraction schemes are proposed, including bigrams, trigrams, and their combination with word segmentation. Naive Bayes, support vector machines, logistic regression, decision trees, and random forests are used for multi-class classification. Experiments compare the feature extraction schemes across the different models, and the results validate the proposed e-commerce user behavior classification method.
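To make the pipeline described in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes character-level bigrams and trigrams over the raw URL string (the abstract does not say whether the N-grams are character- or token-level) and pairs them with a small multinomial Naive Bayes classifier written from scratch. The URLs, the label names, and the helper names `char_ngrams`, `extract_features`, and `NaiveBayes` are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(url, n):
    """Return the overlapping character n-grams of a URL string."""
    return [url[i:i + n] for i in range(len(url) - n + 1)]


def extract_features(url, sizes=(2, 3)):
    """Count bigrams and trigrams in a URL (the sizes are an assumption)."""
    counts = Counter()
    for n in sizes:
        counts.update(char_ngrams(url, n))
    return counts


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, urls, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for url, label in zip(urls, labels):
            feats = extract_features(url)
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, url):
        feats = extract_features(url)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each observed n-gram.
            score = math.log(doc_count / total_docs)
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for gram, count in feats.items():
                seen = self.feature_counts[label][gram]
                score += count * math.log((seen + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical URLs and behavior labels, invented for illustration only.
train_urls = [
    "shop.example.com/search?q=phone",
    "shop.example.com/search?q=shoes",
    "shop.example.com/cart/add?item=123",
    "shop.example.com/cart/add?item=456",
    "shop.example.com/pay/checkout?order=9",
    "shop.example.com/pay/checkout?order=7",
]
train_labels = ["search", "search", "add_to_cart", "add_to_cart", "pay", "pay"]

model = NaiveBayes().fit(train_urls, train_labels)
prediction = model.predict("shop.example.com/search?q=laptop")
print(prediction)
```

The other classifiers named in the abstract (support vector machines, logistic regression, decision trees, random forests) could be trained on the same n-gram counts; Naive Bayes is used here only because it fits in a few lines without external libraries.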