Experiments on coreference resolution for Indonesian language with lexical and shallow syntactic features

2017 5th International Conference on Information and Communication Technology (ICoIC7) Pub Date : 2017-05-01 DOI:10.1109/ICOICT.2017.8074648

Gilang Julian Suherik, A. Purwarianti


Citations: 5

Abstract




We built an Indonesian coreference resolution system that resolves not only pronouns referring to proper nouns, but also proper noun to proper noun and pronoun to pronoun references. It differs from existing Indonesian coreference resolution work in problem scope and features. We conducted experiments using various lexical and shallow syntactic features, such as an appositive feature, a nearest-candidate feature, a direct-sentence feature, previous- and next-word features, and a first-person lexical feature. We also modified the method for building the training set, selecting negative examples by cross-pairing every markable that appears between the antecedent and the anaphor. We compared this with two existing methods for building the training set in experiments using the C4.5 algorithm. Using 200 news sentences, the best experiment achieved an F-measure of 71.6%.
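
The abstract does not spell out how the training pairs are generated, so the following Python sketch is only one possible reading of it, not the authors' implementation. It contrasts a common baseline (the closest antecedent forms the positive pair, and each markable in between is paired with the anaphor as a negative) with a cross-pairing variant in which every non-coreferent pair of markables in the span from antecedent to anaphor becomes a negative example. The `Markable` class, its `entity` field, both function names, and the example mentions are hypothetical and introduced purely for illustration.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Markable:
    index: int   # position of the mention in document order
    text: str    # surface form of the mention
    entity: int  # gold coreference chain id (assumed annotation scheme)


def closest_antecedent(markables, j):
    """Return the index of the nearest preceding markable in the same chain, or None."""
    for i in range(j - 1, -1, -1):
        if markables[i].entity == markables[j].entity:
            return i
    return None


def baseline_pairs(markables):
    """Baseline: negatives are (intervening markable, anaphor) pairs only."""
    pairs = []
    for j, anaphor in enumerate(markables):
        i = closest_antecedent(markables, j)
        if i is None:
            continue
        pairs.append((markables[i], anaphor, 1))
        for k in range(i + 1, j):
            pairs.append((markables[k], anaphor, 0))
    return pairs


def cross_pair_pairs(markables):
    """Assumed reading of cross-pairing: every non-coreferent pair inside
    the antecedent..anaphor span becomes a negative example."""
    pairs, seen = [], set()
    for j, anaphor in enumerate(markables):
        i = closest_antecedent(markables, j)
        if i is None:
            continue
        pairs.append((markables[i], anaphor, 1))
        for a, b in combinations(markables[i:j + 1], 2):
            key = (a.index, b.index)
            if a.entity != b.entity and key not in seen:
                seen.add(key)  # avoid emitting the same negative twice
                pairs.append((a, b, 0))
    return pairs


# Hypothetical mentions: "Budi" and "dia" (he/she) share a chain.
mentions = [
    Markable(0, "Budi", 1),
    Markable(1, "Jakarta", 2),
    Markable(2, "Andi", 3),
    Markable(3, "dia", 1),
]
print(len(baseline_pairs(mentions)), len(cross_pair_pairs(mentions)))
```

Under this reading, cross-pairing yields more negative examples per positive pair than the baseline, which changes the class balance seen by a decision-tree learner such as C4.5. Whether the authors include the antecedent and the anaphor themselves in the cross-pairing is not stated in the abstract.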
