{"title":"A microblog dataset for Tibetan sentiment analysis","authors":"Yong Cuo, X. Shi, Nyima Trashi, Yidong Chen","doi":"10.1109/IALP.2017.8300626","DOIUrl":"https://doi.org/10.1109/IALP.2017.8300626","url":null,"abstract":"We introduce TSTD, a publicly available microblog dataset for Tibetan sentiment analysis. It consists of about 10K Tibetan microblogs classified as positive, negative, or neutral. We present the properties of the dataset and run experiments for 3-way sentiment classification in Tibetan using feature-based and deep learning systems on the dataset.","PeriodicalId":183586,"journal":{"name":"2017 International Conference on Asian Language Processing (IALP)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116997546","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploration of Chinese-Uyghur neural machine translation","authors":"Gulnigar Mahmut, Mewlude Nijat, Rehmutulla Memet, A. Hamdulla","doi":"10.1109/IALP.2017.8300573","DOIUrl":"https://doi.org/10.1109/IALP.2017.8300573","url":null,"abstract":"Nowadays, two people who speak different languages can communicate using real-time translation software, thanks to machine translation technology. China has many languages with great diversity. Uyghur and Chinese are the official languages of Xinjiang Uyghur Autonomous Region, China, which makes it urgent to improve the quality of Chinese-Uyghur (Uyghur-Chinese) machine translation. Recently, neural machine translation (NMT) has achieved promising results for most language pairs. Therefore, in this work, we first briefly analyze the difficulties of Uyghur machine translation. We then study the performance of Chinese-Uyghur machine translation with a statistical framework (PBMT) and two neural network frameworks (NMT and M-NMT). As a result, we not only gain a better understanding of Chinese-Uyghur machine translation but also obtain our baseline system.","PeriodicalId":183586,"journal":{"name":"2017 International Conference on Asian Language Processing (IALP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132961789","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Variational grid setting network","authors":"Yu-Neng Chuang, Zi-Yu Huang, Yen-Lung Tsai","doi":"10.1109/IALP.2017.8300584","DOIUrl":"https://doi.org/10.1109/IALP.2017.8300584","url":null,"abstract":"We propose a new neural network architecture for automatic generation of missing characters in a Chinese font set. We call the architecture the Variational Grid Setting Network; it is based on the variational autoencoder (VAE) with some tweaks. The model is able to generate missing characters that are relatively large in size (256 × 256 pixels). Moreover, we show that one can use very few samples as the training data set and still obtain a satisfactory result.","PeriodicalId":183586,"journal":{"name":"2017 International Conference on Asian Language Processing (IALP)","volume":"12 50","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131437609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On the use of machine translation-based approaches for Vietnamese diacritic restoration","authors":"Thai-Hoang Pham, Xuan-Khoai Pham, Hong Phuong Le","doi":"10.1109/IALP.2017.8300596","DOIUrl":"https://doi.org/10.1109/IALP.2017.8300596","url":null,"abstract":"This paper presents an empirical study of two machine translation-based approaches to the Vietnamese diacritic restoration problem: phrase-based and neural-based machine translation models. This is the first work that applies a neural-based machine translation method to this problem and gives a thorough comparison with the phrase-based machine translation method, which is the current state-of-the-art method for this problem. On a large dataset, the phrase-based approach has an accuracy of 97.32% while that of the neural-based approach is 96.15%. While the neural-based method has a slightly lower accuracy, it is about twice as fast as the phrase-based method in terms of inference speed. Moreover, the neural-based machine translation method has much room for future improvement, such as incorporating pre-trained word embeddings and collecting more training data.","PeriodicalId":183586,"journal":{"name":"2017 International Conference on Asian Language Processing (IALP)","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134185425","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}