Myint Myint Htay, Ye Kyaw Thu, Hninn Aye Thant, T. Supnithi
2020 15th International Joint Symposium on Artificial Intelligence and Natural Language Processing (iSAI-NLP), 2020-11-18. DOI: 10.1109/iSAI-NLP51646.2020.9376783 (https://doi.org/10.1109/iSAI-NLP51646.2020.9376783)
Statistical Machine Translation for Myanmar Language Paraphrase Generation
In this paper, we applied a statistical machine translation (SMT) approach to generate Burmese paraphrases of input sentences and words in Burmese. The system was trained on 89K sentence pairs manually collected from Facebook comments and a daily conversation corpus, together with 89K Burmese paraphrase words collected from Burmese Wiktionary. We implemented three different statistical machine translation models: phrase-based, hierarchical phrase-based, and the operation sequence model. Moreover, we used two segmentation units, character and syllable, to compare machine translation performance. The performance of machine translation, or paraphrase generation, was measured in terms of BLEU, RIBES, chrF++, and WER scores for all experiments. However, automatic evaluation metrics are weak at judging whether a generated Burmese sentence or word "is a paraphrase" or "is not a paraphrase". We therefore also conducted a human evaluation of both the sentence-to-sentence and word-to-word paraphrase generation results. We found that the BLEU and RIBES scores were misleading and that, according to the human evaluation, the machine translation approach is suitable for Burmese paraphrase generation.
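Of the four automatic metrics named above, WER is the simplest to state: the token-level edit distance between the reference and the system output, divided by the reference length. The following is a minimal sketch of that standard definition, not the authors' evaluation script. It assumes the text has already been segmented into space-separated units (characters or syllables), and the function names `edit_distance` and `wer` are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute each cost 1)."""
    # prev[j] holds the distance between the first i-1 items of ref and the first j items of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from the first i items of ref to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion from ref
                            curr[j - 1] + 1,      # insertion into ref
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate over whitespace-separated tokens (assumed pre-segmented)."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one token")
    return edit_distance(ref, hyp) / len(ref)
```

Because WER only counts token edits against a single reference, a valid paraphrase that uses different wording is penalized in the same way as a wrong output, which is consistent with the abstract's point that automatic metrics alone cannot decide whether an output is a paraphrase.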