Performance Analysis of Chinese Webpage Categorizing Algorithm Based on Support Vector Machines (SVM)
Authors: Xiao Gang, Jiancang Xie
Published in: 2009 Fifth International Conference on Information Assurance and Security, 2009-08-18
DOI: 10.1109/IAS.2009.316 (https://doi.org/10.1109/IAS.2009.316)
Automatically categorizing web pages for users is a key technique of the information society, and its core is training on and classifying web pages. This paper studies one of the important algorithms in this field, support vector machines (SVM). By analyzing and simulating four kernel functions and three feature selection methods, the polynomial kernel function and document frequency are chosen as the best combination for the SVM algorithm. A pre-processing algorithm is also given to improve the efficiency of categorization. Simulation shows that adding the pre-processing method to SVM improves web categorization in both precision and time consumption.
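
To make the two selected components concrete, the following is a minimal sketch, not the authors' implementation. It assumes the pages have already been segmented into word tokens, keeps the terms whose document frequency reaches a threshold, and evaluates a polynomial kernel on the resulting term-count vectors. The function names, the toy documents, the threshold `min_df=2`, and the kernel parameters `degree`, `gamma`, and `coef0` are all illustrative assumptions; the abstract does not state the values used in the paper.

```python
from collections import Counter


def document_frequency(docs):
    """Count, for each term, how many documents contain it at least once."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # set() so a term counts once per document
    return df


def select_by_df(docs, min_df=2):
    """Keep terms that occur in at least min_df documents (hypothetical threshold)."""
    df = document_frequency(docs)
    return sorted(term for term, n in df.items() if n >= min_df)


def vectorize(tokens, vocab):
    """Map a token list to a term-count vector over the selected vocabulary."""
    counts = Counter(tokens)
    return [counts[term] for term in vocab]


def poly_kernel(x, y, degree=2, gamma=1.0, coef0=1.0):
    """Polynomial kernel K(x, y) = (gamma * <x, y> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, y))
    return (gamma * dot + coef0) ** degree


# Toy corpus of pre-segmented pages (assumed data, two finance and two sports pages).
docs = [
    ["股票", "市场", "上涨"],
    ["股票", "基金", "市场"],
    ["足球", "比赛", "进球"],
    ["足球", "联赛", "比赛"],
]

vocab = select_by_df(docs, min_df=2)
vectors = [vectorize(d, vocab) for d in docs]

# Gram matrix that a kernel SVM solver would consume during training.
gram = [[poly_kernel(a, b) for b in vectors] for a in vectors]
```

In this toy corpus, pages on the same topic share their retained terms and so receive a larger kernel value than pages on different topics. An SVM solver would take the Gram matrix as input for training; that step is outside the scope of this sketch.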