Experiments on coreference resolution for Indonesian language with lexical and shallow syntactic features
Gilang Julian Suherik, A. Purwarianti
2017 5th International Conference on Information and Communication Technology (ICoIC7), 2017-05-01
DOI: 10.1109/ICOICT.2017.8074648 (https://doi.org/10.1109/ICOICT.2017.8074648)
Citation count: 5
Abstract
We built a coreference resolution system for Indonesian that resolves not only pronouns referring to proper nouns, but also proper-noun-to-proper-noun and pronoun-to-pronoun references. It differs from existing Indonesian coreference resolution work in its problem scope and its features. We conducted experiments with lexical and shallow syntactic features: an appositive feature, a nearest-candidate feature, a direct sentence feature, previous-word and next-word features, and a first-person lexical feature. We also modified the method for building the training set so that negative examples are selected by cross-pairing every markable that appears between the antecedent and the anaphor. We compared this method with two existing methods for building the training set in experiments using the C4.5 algorithm. On 200 news sentences, the best experiment achieved an F-measure of 71.6%.
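
The negative-example selection described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so this is only one possible reading: for each anaphor, the nearest preceding markable in the same chain is taken as its antecedent (the positive pair), and every non-coreferent pair of markables in the span from antecedent to anaphor becomes a negative pair. The function name `build_training_pairs` and the `chain_ids` input format are hypothetical and do not come from the paper.

```python
from itertools import combinations


def build_training_pairs(chain_ids):
    """Build positive and negative markable pairs (an assumed reading of the abstract).

    chain_ids[k] is the entity label of the k-th markable in text order,
    or None when the markable is not part of any coreference chain.
    Returns (positives, negatives) as lists of (earlier_index, later_index).
    """
    positives = []
    negatives = set()
    last_seen = {}  # entity label -> index of its most recent markable

    for j, chain in enumerate(chain_ids):
        if chain is not None and chain in last_seen:
            i = last_seen[chain]  # nearest antecedent of markable j
            positives.append((i, j))
            # Cross-pair every markable in the span i..j (assumption:
            # endpoints included) and keep the non-coreferent pairs.
            for a, b in combinations(range(i, j + 1), 2):
                if chain_ids[a] is None or chain_ids[a] != chain_ids[b]:
                    negatives.add((a, b))
        if chain is not None:
            last_seen[chain] = j

    return positives, sorted(negatives)


# Four markables in text order; the first and last refer to the same entity.
positives, negatives = build_training_pairs(["A", None, "B", "A"])
print(positives)
print(negatives)
```

Under this reading, pairs among the intervening markables themselves also become negatives, which yields more negative examples than pairing each intervening markable with the anaphor alone. The resulting labeled pairs would then be turned into feature vectors for a decision-tree learner such as C4.5.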