Articles Classification in Myanmar Language
Authors: Myat Sapal Phyu, K. Nwet
Venue: 2019 International Conference on Advanced Information Technologies (ICAIT)
Publication date: 2019-11-01
DOI: https://doi.org/10.1109/AITC.2019.8920927
Citations: 3
Abstract
Article classification is a text classification task that assigns each article to its corresponding class or topic. This work identifies two main barriers to classifying Myanmar text with deep learning models: determining word boundaries properly and building datasets for Myanmar text classification. This paper presents empirical evidence on article classification in the Myanmar language at both the syllable level and the word level by fine-tuning Convolutional Neural Networks, denoted Syllable-Level Convolutional Neural Networks (SL-CNN) and Word-Level Convolutional Neural Networks (WL-CNN). Although a few general-purpose pre-trained vectors for the Myanmar language are publicly available and can be applied to transfer learning, large-scale datasets for classifying Myanmar articles still need to be constructed. We construct six datasets for classifying Myanmar articles and evaluate by comparing these vectors on SL-CNN and WL-CNN against Recurrent Neural Networks at the syllable and word levels (SL-RNN and WL-RNN).
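To make the syllable-level versus word-level distinction concrete, here is a minimal sketch of how the same text might enter a convolutional encoder at either granularity. It is not the authors' implementation: the hand-segmented tokens, all function names, the sizes, and the random weights are assumptions chosen for illustration, and the convolution with max-over-time pooling is a generic text-CNN building block that may differ from the architecture used in the paper.

```python
import random

# Hand-segmented toy input (illustrative assumption, not the paper's data):
# the same short phrase split into syllables and into words.
SYLLABLES = ["မြန်", "မာ", "စာ"]
WORDS = ["မြန်မာ", "စာ"]

PAD, UNK = 0, 1  # reserved ids for padding and unknown tokens


def build_vocab(sequences):
    """Map every distinct token to an integer id."""
    vocab = {"<pad>": PAD, "<unk>": UNK}
    for seq in sequences:
        for tok in seq:
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(tokens, vocab, max_len):
    """Convert tokens to ids, truncating or right-padding to max_len."""
    ids = [vocab.get(tok, UNK) for tok in tokens[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


def conv_relu_maxpool(embedded, filt):
    """Apply one filter spanning len(filt) tokens, then ReLU and max over time."""
    width = len(filt)
    best = 0.0  # ReLU makes 0.0 the smallest possible activation
    for start in range(len(embedded) - width + 1):
        window = embedded[start:start + width]
        score = sum(w * x
                    for w_row, x_row in zip(filt, window)
                    for w, x in zip(w_row, x_row))
        best = max(best, score)
    return best


def extract_features(tokens, vocab, table, filters, max_len):
    """Embed a token sequence and return one pooled feature per filter."""
    embedded = [table[i] for i in encode(tokens, vocab, max_len)]
    return [conv_relu_maxpool(embedded, f) for f in filters]


def random_matrix(rng, rows, cols):
    """Random weights standing in for trained or pre-trained vectors."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
EMBED_DIM, MAX_LEN, NUM_FILTERS, WIDTH = 4, 5, 3, 2  # hypothetical sizes

features = {}
for level, tokens in (("syllable", SYLLABLES), ("word", WORDS)):
    vocab = build_vocab([tokens])
    table = random_matrix(rng, len(vocab), EMBED_DIM)
    table[PAD] = [0.0] * EMBED_DIM  # padding contributes nothing to a window
    filters = [random_matrix(rng, WIDTH, EMBED_DIM) for _ in range(NUM_FILTERS)]
    features[level] = extract_features(tokens, vocab, table, filters, MAX_LEN)
    print(level, len(tokens), "tokens ->", len(features[level]), "features")
```

In this sketch only the tokenization changes between the two levels: the syllable sequence is longer and draws on a different vocabulary, while the encoder code is identical. A classifier layer, training loop, and real segmentation step would still be needed and are deliberately left out.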