2019 International Conference on Asian Language Processing (IALP): Latest Publications

Character Decomposition for Japanese-Chinese Character-Level Neural Machine Translation
2019 International Conference on Asian Language Processing (IALP) Pub Date: 2019-11-01 DOI: 10.1109/IALP48816.2019.9037677
Jinyi Zhang, Tadahiro Matsumoto
Abstract: After years of development, Neural Machine Translation (NMT) has produced richer translations than ever across various language pairs, becoming a machine translation model with great potential. An NMT model, however, can only translate words/characters contained in the training data, and one open problem is the handling of low-frequency words/characters. In this paper, we propose a method for removing characters whose frequency of appearance falls below a given minimum threshold by decomposing such characters into their components and/or pseudo-characters, using a Chinese character decomposition table we built. Experiments on Japanese-to-Chinese and Chinese-to-Japanese NMT with the ASPEC-JC (Asian Scientific Paper Excerpt Corpus, Japanese-Chinese) corpus show that the BLEU scores, training time, and number of parameters vary with the chosen minimum threshold for decomposed characters.
Citations: 2
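The thresholded decomposition the abstract describes can be sketched as follows. The two decomposition-table entries are illustrative stand-ins for the paper's full Chinese character decomposition table, and `min_freq` plays the role of the minimum-frequency threshold:

```python
from collections import Counter

# Illustrative entries only; the paper uses a purpose-built table covering
# the full character inventory (components and/or pseudo-characters).
DECOMP_TABLE = {
    "好": ["女", "子"],
    "明": ["日", "月"],
}

def decompose_rare_chars(sentences, min_freq):
    """Replace characters rarer than min_freq with their components,
    so the vocabulary no longer contains low-frequency characters."""
    freq = Counter(ch for s in sentences for ch in s)
    out = []
    for s in sentences:
        chars = []
        for ch in s:
            if freq[ch] < min_freq and ch in DECOMP_TABLE:
                chars.extend(DECOMP_TABLE[ch])  # decompose rare character
            else:
                chars.append(ch)                # keep frequent character
        out.append("".join(chars))
    return out
```

Raising `min_freq` decomposes more characters, which is the knob the paper varies when measuring BLEU, training time, and parameter count.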
How to Answer Comparison Questions
2019 International Conference on Asian Language Processing (IALP) Pub Date: 2019-11-01 DOI: 10.1109/IALP48816.2019.9037729
Hongxuan Tang, Yu Hong, Xin Chen, Kaili Wu, Min Zhang
Abstract: "Which city has the larger population, Tokyo or New York?" To answer such a question, we generally need prior knowledge of the populations of both cities, from which we determine the answer by numeric comparison. Using Machine Reading Comprehension (MRC) to answer such questions has become a popular research topic, referred to as the task of Comparison Question Answering (CQA). In this paper, we propose a novel neural CQA model trained to answer comparison questions. The model is designed as a neural network that performs inference in a step-by-step pipeline: attentive entity detection (e.g., "city"), alignment of comparable attributes (e.g., "population" of the target cities), contrast calculation (larger or smaller), and binary classification of positive and negative answers. Experiments on HotpotQA show that the proposed method achieves an average F1 score of 63.09%, outperforming the baseline by about 10 F1 points. It also performs better than a series of competitive models, including DecompRC and BERT.
Citations: 1
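Once comparable attribute values have been aligned to their entities, the contrast-calculation step of the pipeline reduces to a selection. The sketch below shows only that final step; the population figures are illustrative, and the paper's model performs this step neurally rather than symbolically:

```python
def contrast_step(attribute_values, want_larger=True):
    """Final contrast calculation: given attribute values aligned to each
    candidate entity, return the entity the comparison selects."""
    pick = max if want_larger else min
    return pick(attribute_values, key=attribute_values.get)

# Aligned attributes for the example question (approximate figures):
facts = {"Tokyo": 13_960_000, "New York": 8_800_000}
```

`contrast_step(facts, want_larger=True)` selects "Tokyo" for "which city has the larger population".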
An Enhancement of Malay Social Media Text Normalization for Lexicon-Based Sentiment Analysis
2019 International Conference on Asian Language Processing (IALP) Pub Date: 2019-11-01 DOI: 10.1109/IALP48816.2019.9037700
Muhammad Fakhrur Razi Abu Bakar, N. Idris, Liyana Shuib
Abstract: Nowadays, most Malaysians use social media such as Twitter to express their opinions on the latest issues publicly. However, user individuality and linguistic creativity create huge volumes of noisy words, making such text unsuitable as a dataset for Natural Language Processing applications such as sentiment analysis because of the irregularity of the language used. It is therefore important to convert these noisy words into their standard forms. To date, there are few studies on normalizing noisy words for the Malay language. The aim of this study is thus to propose an enhancement of Malay social media text normalization for lexicon-based sentiment analysis. The normalizer comprises six main modules: (1) advanced tokenization, (2) Malay/English token detection, (3) lexical rules, (4) noisy token replacement, (5) n-gram, and (6) detokenization. An evaluation was conducted, and the findings show that the system achieved 83.55% precision and 84.61% recall.
Citations: 6
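A minimal sketch of three of the six modules (tokenization, noisy token replacement, detokenization). The noisy-to-standard lexicon entries below are hypothetical examples, not the study's actual lexicon, and the real pipeline adds language detection, lexical rules, and n-gram ranking in between:

```python
import re

# Hypothetical noisy-to-standard Malay lexicon (module 4 draws on a much
# larger resource in the paper).
NOISY_LEXICON = {"sgt": "sangat", "x": "tidak", "mkn": "makan"}

def normalize(text):
    """Tokenize (module 1), replace noisy tokens via the lexicon (module 4),
    and detokenize (module 6)."""
    tokens = re.findall(r"\w+|[^\w\s]", text)          # words and punctuation
    replaced = [NOISY_LEXICON.get(t.lower(), t) for t in tokens]
    return " ".join(replaced)
```

Standardized output like this is what makes downstream lexicon-based sentiment scoring possible, since sentiment lexicons list only standard forms.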
Acoustic Scene Classification Using Deep Convolutional Neural Network via Transfer Learning
2019 International Conference on Asian Language Processing (IALP) Pub Date: 2019-11-01 DOI: 10.1109/IALP48816.2019.9037692
Min Ye, Hong Zhong, Xiao Song, Shilei Huang, Gang Cheng
Abstract: We use a deep convolutional neural network with transfer learning for Acoustic Scene Classification (ASC). For this purpose, a powerful and popular deep learning architecture, the Residual Neural Network (ResNet), is adopted. Transfer learning is used to fine-tune the pre-trained ResNet model on the TUT Urban Acoustic Scenes 2018 dataset. Furthermore, focal loss is used to improve overall performance, and to reduce the chance of overfitting, data augmentation based on mixup is applied. Our best system achieves an improvement of more than 10% in class-wise accuracy over the Detection and Classification of Acoustic Scenes and Events (DCASE) 2018 baseline system on the TUT Urban Acoustic Scenes 2018 dataset.
Citations: 2
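The mixup augmentation mentioned in the abstract interpolates pairs of training examples and their labels with a Beta-distributed mixing coefficient. A minimal sketch; the `alpha` value and seeded generator are illustrative choices, not the paper's reported hyperparameters:

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2, rng=None):
    """Mix two examples and their (one-hot) labels: lam ~ Beta(alpha, alpha),
    then form convex combinations of inputs and labels."""
    rng = rng or np.random.default_rng(0)
    lam = rng.beta(alpha, alpha)          # mixing coefficient in (0, 1)
    x = lam * x1 + (1 - lam) * x2
    y = lam * y1 + (1 - lam) * y2
    return x, y
```

Training on such virtual examples regularizes the classifier, which is why the paper applies it against overfitting.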
Sinhala and Tamil Speech Intent Identification From English Phoneme Based ASR
2019 International Conference on Asian Language Processing (IALP) Pub Date: 2019-11-01 DOI: 10.1109/IALP48816.2019.9037702
Yohan Karunanayake, Uthayasanker Thayasivam, Surangika Ranathunga
Abstract: Today there are many use cases for content-based speech classification, including speech topic identification and spoken command recognition. Automatic Speech Recognition (ASR) underlies all of these applications, converting speech into textual form. However, creating an ASR system for a language is a resource-intensive task. Even though there are more than 6,000 languages, these speech applications are limited to the most well-resourced languages, such as English, because of data availability. Some past research has looked into classifying speech while addressing data scarcity, but those methods have their own limitations. In this paper, we present an English-phoneme-based speech intent classification methodology for the Sinhala and Tamil languages. We use a pre-trained English ASR model to generate phoneme probability features and use them to identify the intents of utterances expressed in Sinhala and Tamil, for which only rather small speech datasets are available. The experimental results show that the proposed method achieves more than 80% accuracy with a 0.5-hour limited speech dataset in both languages.
Citations: 12
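A toy sketch of the feature idea: phoneme posteriors from an English ASR model are pooled over time into a fixed-length utterance representation, which a downstream intent classifier consumes. The mean pooling and nearest-centroid classifier here are simplifying stand-ins for the paper's actual feature handling and classifier:

```python
import numpy as np

def phoneme_features(posteriors):
    """Collapse a (time, phoneme) posterior matrix from an English phoneme
    recognizer into one fixed-length utterance feature (mean over time)."""
    return posteriors.mean(axis=0)

def classify(feature, centroids):
    """Assign the intent whose (hypothetical) centroid is nearest."""
    return min(centroids, key=lambda k: np.linalg.norm(feature - centroids[k]))
```

Because the English phoneme inventory is fixed, the same frontend serves Sinhala and Tamil utterances without any target-language ASR.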
Design and Implementation of Burmese Speech Synthesis System Based on HMM-DNN
2019 International Conference on Asian Language Processing (IALP) Pub Date: 2019-11-01 DOI: 10.1109/IALP48816.2019.9037731
Mengyuan Liu, Jian Yang
Abstract: Speech synthesis research and applications are well developed for Chinese and English. However, most less widely used languages have relatively few electronic language resources, and speech synthesis research for them lags behind. Burmese is written in an alphabetic script and belongs to the Tibeto-Burman branch of the Sino-Tibetan language family. To develop a Burmese speech synthesis application system, this paper studies Burmese speech waveform synthesis, designs and implements an HMM-based Burmese speech synthesis baseline system, and, on this basis, introduces a deep neural network (DNN) to replace the decision-tree model of the HMM speech synthesis system, thereby improving the acoustic model and the quality of the synthesized speech. The experimental results show that the baseline system is feasible and that introducing the DNN effectively improves synthesis quality.
Citations: 1
Learning Deep Matching-Aware Network for Text Recommendation using Clickthrough Data
2019 International Conference on Asian Language Processing (IALP) Pub Date: 2019-11-01 DOI: 10.1109/IALP48816.2019.9037682
Haonan Liu, Nankai Lin, Zitao Chen, Ke Li, Sheng-yi Jiang
Abstract: With the trend of information globalization, the volume of text information is exploding, resulting in information overload. Text recommendation systems have proven to be valuable tools for helping users in such situations. Most researchers define text recommendation as a static problem, ignoring sequential information. In this paper, we propose a text recommendation framework with a matching-aware interest extractor and a dynamic interest extractor. We apply an attention-based Long Short-Term Memory network (LSTM) to model a user's dynamic interest, and we model a user's static interest with the idea of semantic matching. We integrate a user's dynamic and static interests to decide whether to recommend a text. We also propose a reasonable method for constructing a text recommendation dataset from the clickthrough data of the CCIR 2018 shared task on Personal Recommendation, on which we test our model and several baseline models. The experiments show that our model outperforms all baseline models as well as a state-of-the-art model, reaching an F1-score of 0.76.
Citations: 1
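One way to read the integration of dynamic and static interest is as a score fusion followed by a threshold decision. The weight and threshold below are hypothetical, and the paper's combination may well be learned end to end rather than fixed:

```python
def recommend(dynamic_score, static_score, w=0.5, threshold=0.5):
    """Fuse the dynamic-interest score (LSTM side) and static-interest score
    (semantic-matching side) and decide whether to recommend the text.
    w and threshold are illustrative, not the paper's values."""
    fused = w * dynamic_score + (1 - w) * static_score
    return fused >= threshold
```

A text is recommended only when the fused interest clears the threshold, mirroring the binary recommend/skip decision the framework makes.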
Combination of Semantic Relatedness with Supervised Method for Word Sense Disambiguation
2019 International Conference on Asian Language Processing (IALP) Pub Date: 2019-11-01 DOI: 10.1109/IALP48816.2019.9037717
Qiaoli Zhou, Yuguang Meng
Abstract: We present a semi-supervised learning method that efficiently exploits semantic relatedness in order to incorporate sense knowledge into a word sense disambiguation (WSD) model and to improve system performance. We present semantic-relatedness algorithms that combine a neural model, learned from a generic embedding function over variable-length contexts of target words on a POS-labeled text corpus, with sense-labeled data in the form of example sentences. This paper investigates ways of incorporating semantic relatedness in a word sense disambiguation setting and evaluates the method on several SensEval/SemEval lexical sample tasks. The results show that such representations consistently improve the accuracy of the selective supervised WSD system.
Citations: 1
Tibetan word segmentation method based on CNN-BiLSTM-CRF model
2019 International Conference on Asian Language Processing (IALP) Pub Date: 2018-11-01 DOI: 10.1109/IALP48816.2019.9037661
Lili Wang, Hongwu Yang, Xiaotian Xing, Yajing Yan
Abstract: We propose a Tibetan word segmentation method based on a CNN-BiLSTM-CRF model that uses only the characters of a sentence as input, so the method needs neither large-scale corpus resources nor manual features for training. First, we use a convolutional neural network to train character vectors: character vectors are looked up in a character table and stacked to form a matrix C, and convolutions between C and multiple filter matrices, followed by max pooling, yield the character-level features of each Tibetan word. We then feed the character vectors, through a highway network, into the BiLSTM-CRF model, obtaining a Tibetan word segmentation model optimized with the character vectors and a CRF layer. For Tibetan, a morphologically rich language, fewer parameters and faster training make this model outperform the plain BiLSTM-CRF model at the character level. The experimental results show that character input is sufficient for language modeling, and that the model improves the robustness of Tibetan word segmentation, achieving an F value of 95.17%.
Citations: 2
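The character-level feature extraction described above (stack character vectors into a matrix, convolve with multiple filters, max-pool over positions) can be sketched with plain NumPy; the filter width and values are illustrative, and a real system would learn them:

```python
import numpy as np

def char_cnn_features(char_vecs, filters, width=2):
    """Slide each filter over width-sized windows of stacked character
    vectors (the matrix C) and max-pool over positions, giving one
    character-level feature per filter."""
    T, d = char_vecs.shape
    # Each window flattens `width` consecutive character vectors.
    windows = np.stack([char_vecs[i:i + width].ravel()
                        for i in range(T - width + 1)])
    scores = windows @ filters.T      # (positions, filters) convolution scores
    return scores.max(axis=0)         # max pooling over positions
```

The pooled vector is what the BiLSTM-CRF consumes (via the highway network) in place of hand-engineered features.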