New Word Recognition Based on Support Vector Machines and Constraints
Xu Yuan-fang, Gu Hui
2015 2nd International Conference on Information Science and Control Engineering, 2015-04-24
DOI: 10.1109/ICISCE.2015.82 (https://doi.org/10.1109/ICISCE.2015.82)
This paper proposes a method for identifying new words more effectively. First, positive and negative samples are extracted from a training corpus that has been segmented and POS-tagged against a dictionary. These samples are combined with the word classifications obtained from the training corpus, and a support vector machine (SVM) is trained on them to obtain the support vectors for new words. Next, a test corpus containing simulated new words is segmented and POS-tagged, and candidate new words are selected using the relevant constraints together with slack variables. Each candidate is quantized, combined with the characteristics of the word itself, and passed to the SVM classifier; the classifier's output is compared with a threshold to decide whether the candidate is a new word. The results show that the recall and the correct rate (precision) of the new word identification system are best when the radial basis function (RBF) kernel is used. We conclude that the method improves both the precision and the recall of new word recognition.
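The scoring step described above (an RBF-kernel SVM output compared with a threshold) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the two-dimensional feature vectors, the support vectors, the dual coefficients, the bias, gamma, and the threshold are all hypothetical values chosen for the example, since the abstract does not give them.

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], z: Sequence[float], gamma: float) -> float:
    """Radial basis function kernel: K(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def decision_score(x, support_vectors, dual_coefs, bias, gamma):
    """Standard SVM decision value: sum_i coef_i * K(sv_i, x) + bias,
    where coef_i = alpha_i * y_i comes from training."""
    return sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    ) + bias


def is_new_word(x, support_vectors, dual_coefs, bias, gamma, threshold=0.0):
    """Accept a candidate when its decision value exceeds the threshold."""
    return decision_score(x, support_vectors, dual_coefs, bias, gamma) > threshold


# Hypothetical toy model (assumed values, not taken from the paper):
# one positive and one negative support vector in a 2-D feature space.
SUPPORT_VECTORS = [(0.9, 0.8), (0.1, 0.2)]
DUAL_COEFS = [1.0, -1.0]  # alpha_i * y_i
BIAS = 0.0
GAMMA = 1.0

# Hypothetical quantized feature vectors for two candidate strings.
candidate_a = (0.85, 0.75)  # lies close to the positive support vector
candidate_b = (0.15, 0.25)  # lies close to the negative support vector

for name, features in (("candidate_a", candidate_a), ("candidate_b", candidate_b)):
    score = decision_score(features, SUPPORT_VECTORS, DUAL_COEFS, BIAS, GAMMA)
    accepted = is_new_word(features, SUPPORT_VECTORS, DUAL_COEFS, BIAS, GAMMA)
    print(f"{name}: score={score:.3f}, new word={accepted}")
```

Raising the threshold above zero trades recall for precision, which is one plausible reason for exposing it as a separate parameter rather than relying on the sign of the decision value alone.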