The Research on Chinese Coreference Resolution Based on Support Vector Machines
Yihao Zhang, Peng Jin
2010 Fourth International Conference on Genetic and Evolutionary Computing, published 2010-12-13
DOI: 10.1109/ICGEC.2010.49 (https://doi.org/10.1109/ICGEC.2010.49)
Citation count: 1
Abstract
Coreference is a common linguistic phenomenon in natural language understanding; it plays an important role in simplifying expression and linking context. In this paper, support vector machines are applied to the problem of Chinese coreference resolution. We consider the important features related to coreference and integrate them to build the model. When handling the training data, we use data scaling to balance the range of feature values and cross-validation to optimize the training parameters of the model. The experimental results show that the classification model achieves F-scores of 76.80% on positive instances and 90.91% on negative instances on the Lancaster Corpus of Mandarin Chinese.
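The data handling and evaluation steps mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`fit_min_max`, `scale`, `k_fold_indices`, `f_score`) are hypothetical, the target range of [-1, 1] is an assumption the abstract does not state, and the SVM training step itself is left out because the abstract does not name a kernel, library, or parameter grid.

```python
from typing import List, Sequence, Tuple


def fit_min_max(rows: Sequence[Sequence[float]]) -> List[Tuple[float, float]]:
    """Return (min, max) for each feature column, computed on training rows."""
    return [(min(col), max(col)) for col in zip(*rows)]


def scale(rows: Sequence[Sequence[float]],
          ranges: Sequence[Tuple[float, float]],
          lo: float = -1.0,
          hi: float = 1.0) -> List[List[float]]:
    """Linearly map each feature into [lo, hi] (range is an assumption).

    Constant columns (min == max) are mapped to lo to avoid division by zero.
    """
    scaled = []
    for row in rows:
        out = []
        for value, (mn, mx) in zip(row, ranges):
            if mx == mn:
                out.append(lo)
            else:
                out.append(lo + (hi - lo) * (value - mn) / (mx - mn))
        scaled.append(out)
    return scaled


def k_fold_indices(n: int, k: int) -> List[List[int]]:
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    folds = []
    start = 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def f_score(gold: Sequence[int], pred: Sequence[int], label: int) -> float:
    """F1 for one class: harmonic mean of precision and recall."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

In a cross-validation loop, each fold from `k_fold_indices` would serve once as the held-out set: the scaling ranges are fitted on the remaining folds, applied to both parts, and a classifier trained with one candidate parameter setting is scored on the held-out fold. Calling `f_score` once with the positive label and once with the negative label gives the two per-class figures of the kind the abstract reports.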