Authors: Meng Wenchao, Liu Lianchen, Chen Anyan
Venue: 2010 IEEE International Conference on Software Engineering and Service Sciences
Published: 2010-07-16
DOI: https://doi.org/10.1109/ICSESS.2010.5552323
A comparative study on Chinese word segmentation using statistical models
In recent years, character-based approaches to Chinese word segmentation have been developed and have shown great success. This paper presents a detailed comparison of three statistical models: HMM, MEMM, and CRF. First, different tag sets are used to evaluate the models' precision and efficiency. Next, HMM and MEMM are compared using similar features. Then, different features are compared to determine which contributes most to Chinese word segmentation. Finally, some suggestions are given for developing Chinese word segmentation systems.
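To make the notion of a character-based tag set concrete, the following is a minimal sketch of how a segmented sentence can be turned into per-character labels and back. The abstract does not say which tag sets the paper evaluates, so the two schemes here (a 2-tag B/I set and a 4-tag B/M/E/S set) are assumptions drawn from common practice, and the function names `words_to_tags` and `tags_to_words` are hypothetical.

```python
def words_to_tags(words, scheme="BMES"):
    """Label every character of a segmented sentence with a position tag.

    Assumed schemes (not taken from the paper):
      "BI":   B = first character of a word, I = any later character
      "BMES": B = begin, M = middle, E = end, S = single-character word
    """
    tags = []
    for word in words:
        if not word:
            continue  # skip empty tokens
        if scheme == "BI":
            tags.extend(["B"] + ["I"] * (len(word) - 1))
        elif scheme == "BMES":
            if len(word) == 1:
                tags.append("S")
            else:
                tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
        else:
            raise ValueError(f"unknown tag scheme: {scheme}")
    return tags


def tags_to_words(chars, tags):
    """Rebuild words from characters and their predicted tags."""
    words = []
    current = ""
    for ch, tag in zip(chars, tags):
        # B or S starts a new word, so flush any word still being built.
        if tag in ("B", "S") and current:
            words.append(current)
            current = ""
        current += ch
        # E or S closes the current word immediately.
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    if current:
        words.append(current)  # trailing word (always the case for "BI")
    return words


if __name__ == "__main__":
    sentence = ["我", "喜欢", "自然语言"]
    for scheme in ("BI", "BMES"):
        tags = words_to_tags(sentence, scheme)
        print(scheme, tags, tags_to_words("".join(sentence), tags))
```

Under this framing, a model such as an HMM, MEMM, or CRF only has to predict one tag per character. A larger tag set encodes more positional information but enlarges the label space the model must search, which is the kind of precision and efficiency trade-off the abstract refers to.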