Kui Wu, Xuancong Wang, Nina Zhou, AiTi Aw, Haizhou Li
2015 International Conference on Asian Language Processing (IALP), published 2015-10-24. DOI: 10.1109/IALP.2015.7451527 (https://doi.org/10.1109/IALP.2015.7451527)
Joint Chinese word segmentation and punctuation prediction using deep recurrent neural network for social media data
In this work, we propose to jointly perform Chinese word segmentation (CWS) and punctuation prediction (PU) in a unified framework using a deep recurrent neural network (DRNN). On a social media corpus, we compare the joint frameworks against isolated prediction of each task and against pipeline methods that link the two tasks sequentially. Our experimental results show that the joint models improve CWS performance while affecting PU performance only marginally. We also study the effects of CWS and PU on Chinese-to-English machine translation (MT) quality by evaluating on a parallel social media corpus. The results show that the joint models are superior to both the isolated prediction and the pipeline approaches.
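
The abstract does not say how the two tasks are combined into one prediction problem. One common way to frame joint CWS and PU is as a single per-character sequence-labeling task, where each character carries a segmentation tag and a punctuation tag. The sketch below illustrates that framing only. The BMES tag set, the `seg|punct` label format, the `PUNCT` set, and the `encode`/`decode` helpers are all assumptions made for illustration and are not taken from the paper.

```python
# Illustrative joint labeling scheme (an assumption, not the paper's scheme).
# Each character gets "<seg>|<punct>":
#   seg   : B/M/E for begin/middle/end of a multi-character word, S for a single-character word
#   punct : the punctuation mark that follows the word, or "O" for none

PUNCT = {"，", "。", "？", "！"}  # hypothetical punctuation inventory


def encode(tokens):
    """Turn a segmented, punctuated token list into characters and joint labels."""
    chars, labels = [], []
    for tok in tokens:
        if tok in PUNCT:
            # Attach the mark to the last character of the preceding word.
            # Sentence-initial punctuation is dropped in this simple sketch.
            if labels:
                seg, _ = labels[-1].split("|")
                labels[-1] = f"{seg}|{tok}"
            continue
        if len(tok) == 1:
            segs = ["S"]
        else:
            segs = ["B"] + ["M"] * (len(tok) - 2) + ["E"]
        for ch, seg in zip(tok, segs):
            chars.append(ch)
            labels.append(f"{seg}|O")
    return chars, labels


def decode(chars, labels):
    """Rebuild the token list (words plus punctuation) from joint labels."""
    tokens, word = [], ""
    for ch, label in zip(chars, labels):
        seg, punct = label.split("|")
        word += ch
        if seg in ("E", "S"):
            tokens.append(word)
            word = ""
            if punct != "O":
                tokens.append(punct)
    if word:  # flush an unfinished word left by an ill-formed label sequence
        tokens.append(word)
    return tokens


if __name__ == "__main__":
    sentence = ["今天", "天气", "好", "，", "我们", "去", "公园", "。"]
    chars, labels = encode(sentence)
    for ch, label in zip(chars, labels):
        print(ch, label)
    print(decode(chars, labels))
```

Under this framing, a single tagger predicts one joint label per character of the unsegmented, unpunctuated input, whereas a pipeline would run a segmentation tagger first and feed its output to a separate punctuation tagger.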