Research on the System of Jointing Chinese Word Segmentation with Part-of-Speech Tagging
Qin Li, Wei Wei
2013 Sixth International Symposium on Computational Intelligence and Design, 2013-10-28
DOI: 10.1109/ISCID.2013.103 (https://doi.org/10.1109/ISCID.2013.103)
Citation count: 2
Abstract
In this paper, we construct a system that integrates Chinese word segmentation with part-of-speech (POS) tagging, using a hybrid approach based on a dictionary and statistics. In the first stage, the input is roughly segmented into many nodes by looking up a word dictionary, and these nodes are used to generate possible paths as candidates, instead of choosing only the N-shortest paths. In the next stage, a cost is calculated for each candidate path by a statistical method. With precision on combinational ambiguities improved, the optimal path, i.e. the one with the lowest cost, is chosen as the final result. Preliminary experiments show that the joint system achieves a segmentation precision of 94.06% and a POS tagging precision of 90.96%, with recall of 96.86% and 93.67% and F-measure of 95.44% and 92.29% for segmentation and POS tagging, respectively. Work on improving the performance of the system is still ongoing.
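The two-stage idea in the abstract (dictionary lookup to generate candidate paths, then a statistical cost to pick the cheapest one) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy lexicon and its counts, the tag-bigram table, the `UNKNOWN_COST` and `UNSEEN_TRANSITION` constants, and the cost function itself (negative log word/tag frequency plus negative log tag-transition probability). The abstract does not say which statistical method the authors use.

```python
import math
from itertools import product

# Hypothetical toy lexicon (word -> {POS tag: count}); not taken from the paper.
LEXICON = {
    "研究": {"v": 6, "n": 4},
    "研究生": {"n": 5},
    "生命": {"n": 8},
    "生": {"v": 2},
    "命": {"n": 2},
    "起源": {"n": 5},
}
TOTAL = sum(c for tags in LEXICON.values() for c in tags.values())

# Hypothetical tag-bigram probabilities; "<s>" marks the sentence start.
TRANSITIONS = {
    ("<s>", "v"): 0.4, ("<s>", "n"): 0.6,
    ("v", "n"): 0.7, ("v", "v"): 0.3,
    ("n", "n"): 0.5, ("n", "v"): 0.5,
}
UNSEEN_TRANSITION = 0.01  # assumed floor for tag pairs missing from the table
UNKNOWN_TAG = "x"         # assumed tag for characters not in the lexicon
UNKNOWN_COST = 10.0       # assumed penalty for such characters


def candidate_segmentations(sentence):
    """Return every segmentation of `sentence` into dictionary words.

    Single characters missing from the lexicon are allowed as fallback
    tokens so that at least one path always exists.
    """
    def extend(start):
        if start == len(sentence):
            yield []
            return
        for end in range(start + 1, len(sentence) + 1):
            word = sentence[start:end]
            if word in LEXICON or end == start + 1:
                for rest in extend(end):
                    yield [word] + rest
    return list(extend(0))


def path_cost(tagged_path):
    """Sum of negative log word/tag and tag-transition probabilities."""
    cost, prev = 0.0, "<s>"
    for word, tag in tagged_path:
        count = LEXICON.get(word, {}).get(tag)
        cost += -math.log(count / TOTAL) if count else UNKNOWN_COST
        cost += -math.log(TRANSITIONS.get((prev, tag), UNSEEN_TRANSITION))
        prev = tag
    return cost


def segment_and_tag(sentence):
    """Pick the lowest-cost (word, tag) path among all candidates."""
    best = None
    for words in candidate_segmentations(sentence):
        tag_options = [list(LEXICON.get(w, {UNKNOWN_TAG: 0})) for w in words]
        for tags in product(*tag_options):
            path = list(zip(words, tags))
            cost = path_cost(path)
            if best is None or cost < best[0]:
                best = (cost, path)
    return best[1]


print(segment_and_tag("研究生命起源"))
# [('研究', 'v'), ('生命', 'n'), ('起源', 'n')]
```

With these toy numbers, the combinationally ambiguous string 研究生命起源 is resolved as 研究/生命/起源 rather than 研究生/命/起源, because the former path has the lower total cost. The sketch enumerates every candidate path exhaustively, which is exponential in sentence length; a practical system would more likely score the same word lattice with dynamic programming.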