A User Segmentation Approach for UGC Platform Based on a New Lead User Identification Index System and K-means Clustering
Authors: D. Chang, Jing Zhao, F. Zou, Gangyan Xu
Published in: 2020 IEEE International Conference on Industrial Engineering and Engineering Management (IEEM), 2020-12-14
DOI: 10.1109/IEEM45057.2020.9309940 (https://doi.org/10.1109/IEEM45057.2020.9309940)
User-generated content (UGC) has become an important part of Internet user data. This study aims to develop an approach for identifying innovative users on UGC platforms. To this end, it proposes i) a web mining process to crawl UGC data; ii) a lead user identification index system for evaluating users' innovation capability; and iii) a user classification process that applies K-means clustering to users' UGC performance. Specifically, complete performance data for more than 100 users on Douban (one of the biggest UGC platforms in China) were collected, and web mining, factor analysis, and a clustering algorithm were integrated to process the data and classify users into groups according to their UGC performance. The classification results were verified by incorporating expert judgment, which showed that the classification can accurately recognize users with the appropriate lead userness. This research is expected to help small and medium enterprises that lack strong big data capabilities identify innovative users and valuable UGC data more efficiently, and to facilitate further product improvement.
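The clustering step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the three indicator columns, the six user rows, and the choice of k = 2 are made-up placeholders, since the abstract does not list the actual index system or data. The sketch skips the web mining and factor analysis stages, standardizes the scores, and runs plain Lloyd-style K-means with a deterministic farthest-first initialization (an assumption made here so the result is reproducible).

```python
from math import dist  # Euclidean distance, Python 3.8+


def standardize(rows):
    """Z-score each column so no single indicator dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column has zero variance.
    stds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def init_centroids(points, k):
    """Farthest-first initialization: deterministic, an assumption of this sketch."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(dist(p, c) for c in centroids))
        centroids.append(list(far))
    return centroids


def kmeans(points, k, iters=100):
    """Plain Lloyd iterations; returns (labels, centroids)."""
    centroids = init_centroids(points, k)
    labels = [-1] * len(points)
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: dist(p, centroids[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


# Hypothetical per-user index scores (placeholder columns, not from the paper):
# [number of posts, number of followers, share of posts with original ideas]
users = {
    "u1": [120, 45, 0.90],
    "u2": [110, 50, 0.80],
    "u3": [130, 40, 0.85],
    "u4": [5, 2, 0.10],
    "u5": [8, 1, 0.20],
    "u6": [3, 3, 0.15],
}

labels, centroids = kmeans(standardize(list(users.values())), k=2)
for name, label in zip(users, labels):
    print(name, "-> cluster", label)
```

On this toy data the three high-activity users fall into one cluster and the three low-activity users into the other. In practice the number of clusters and the interpretation of each cluster (for example, which group shows lead userness) would still need to be checked by domain experts, as the abstract describes.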