Research on Improved Algorithm for Chinese Word Segmentation Based on Markov Chain
Pang Baomao, Shi Haoshan
2009 Fifth International Conference on Information Assurance and Security, published 2009-08-18
DOI: 10.1109/IAS.2009.317 (https://doi.org/10.1109/IAS.2009.317)
Chinese word segmentation is an important technique for Chinese web data mining. After surveying existing Chinese word segmentation methods, this paper proposes an improved algorithm. The algorithm updates its dictionary using a two-way Markov chain and segments text with an improved Forward Maximum Matching method based on word-frequency statistics. Simulation shows that the algorithm can segment a given text quickly and accurately.
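For readers unfamiliar with the baseline, the following is a minimal sketch of plain Forward Maximum Matching over a word-frequency dictionary. It is not the authors' improved algorithm, whose details the abstract does not give. The function name `forward_max_match`, the parameters `max_len` and `min_freq`, and the toy dictionary with its counts are all assumptions made for illustration; in particular, ignoring words below a frequency threshold is only one hypothetical way that frequency statistics could influence matching.

```python
def forward_max_match(text, freq, max_len=4, min_freq=1):
    """Greedy left-to-right segmentation, longest dictionary word first.

    freq maps each dictionary word to a count. Words whose count is
    below min_freq are ignored (an illustrative assumption, not the
    paper's method).
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest window first and shrink until something matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as the fallback.
            if size == 1 or freq.get(piece, 0) >= min_freq:
                tokens.append(piece)
                i += size
                break
    return tokens


# Toy dictionary with made-up counts, for illustration only.
toy_freq = {"研究": 50, "研究生": 20, "生命": 30, "起源": 10}

print(forward_max_match("研究生命起源", toy_freq))
# ['研究生', '命', '起源']
```

On this toy input the greedy match takes "研究生" first and strands "命", the kind of ambiguity that plain maximum matching cannot resolve on its own and that frequency-aware variants aim to reduce.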