Improved parsing with taxonomy of conjunctions
Dongchen Li, Xiantao Zhang, Xihong Wu
2014 IEEE China Summit & International Conference on Signal and Information Processing (ChinaSIP), 2014-07-09
DOI: 10.1109/ChinaSIP.2014.6889199 (https://doi.org/10.1109/ChinaSIP.2014.6889199)
Citations: 1
Abstract
Incorporating knowledge into parser training has been shown to remedy weaknesses of probabilistic context-free grammars. Previous parsing systems have exploited semantic resources for content words and word-formation knowledge. However, they do not take conjunction category refinement into account, which proves helpful in predicting syntactic structure and syntactic labels in Chinese. We define a conjunction taxonomy that represents intrinsic syntactic constraints, and show that the refined conjunction categories in the taxonomy contribute to improved parsing performance. The taxonomy is used to supervise the splitting of these refined tags, and the automatic hierarchical state-split approach is employed to compensate for the taxonomy's limited scope and degree of refinement. Experiments on the Penn Chinese Treebank show that our method improves parsing performance significantly.