A Comparative Study of Named Entity Recognition on Myanmar Language
Tin Latt Nandar, Thinn Lai Soe, K. Soe
2020 23rd Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA), 2020-11-05
https://doi.org/10.1109/O-COCOSDA50338.2020.9295004
This paper presents the development of a Myanmar Named Entity Recognition (NER) system using Conditional Random Fields (CRFs). The system was developed with a manually annotated Named Entity (NE) corpus collected from Myanmar news websites and the Asian Language Treebank (ALT) parallel corpus. We compare the performance of the system when given syllable-based input with its performance when given character-based input. We observed that the training data has a substantial impact on the performance of the system. The experimental results show that the syllable-based system performs better than the character-based system, achieving precision, recall, and F1-score values of 93.62%, 91.64%, and 92.62%, respectively.
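The three reported figures can be cross-checked against each other, since the F1-score is the harmonic mean of precision and recall. The short sketch below is only an arithmetic check on the numbers quoted in the abstract; the function name `f1_score` is an illustrative choice and is not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall.

    Inputs and output share the same unit (here: percent).
    """
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both values are zero
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the syllable-based system, in percent.
precision, recall = 93.62, 91.64
f1 = f1_score(precision, recall)
print(f"F1 = {f1:.2f}%")
```

Rounded to two decimals, the result agrees with the 92.62% stated in the abstract.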