2017 International Conference on Asian Language Processing (IALP): Latest Papers

Hybrid answer selection model for non-factoid question answering
2017 International Conference on Asian Language Processing (IALP) Pub Date : 2017-12-01 DOI: 10.1109/IALP.2017.8300620
R. Ma, Jian Zhang, Miao Li, Lei Chen, Jingyang Gao
{"title":"Hybrid answer selection model for non-factoid question answering","authors":"R. Ma, Jian Zhang, Miao Li, Lei Chen, Jingyang Gao","doi":"10.1109/IALP.2017.8300620","DOIUrl":"https://doi.org/10.1109/IALP.2017.8300620","url":null,"abstract":"Capturing the semantic associations between questions and answers is a challenging task for answer selection. In this paper, a hybrid answer selection model is proposed by combining Convolutional Neural Network (CNN) and abstract extraction methods. In the model, answer summarization is extracted from the text with multiple features, and sent to the CNN together with the question to obtain a concise and efficient semantic representation. Unlike previous deep models, irrelevant information is removed and better representations are generated for question and answer, which is necessary for non-factoid question answering. The results on two datasets InsuranceQA and Agriculture QA show that our model outperforms other single deep models.","PeriodicalId":183586,"journal":{"name":"2017 International Conference on Asian Language Processing (IALP)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115634886","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Embedding Wikipedia title based on its Wikipedia text and categories
2017 International Conference on Asian Language Processing (IALP) Pub Date : 2017-12-01 DOI: 10.1109/IALP.2017.8300566
Chi-Yen Chen, Wei-Yun Ma
{"title":"Embedding wikipedia title based on its wikipedia text and categories","authors":"Chi-Yen Chen, Wei-Yun Ma","doi":"10.1109/IALP.2017.8300566","DOIUrl":"https://doi.org/10.1109/IALP.2017.8300566","url":null,"abstract":"Distributed word representation is widely used in many NLP tasks and knowledge-based resources also provide valuable information. Comparing to conventional knowledge bases, Wikipedia provides semi-structural data other than structural data. We argue that a Wikipedia title's categories can help complement the title's meaning besides Wikipedia text, so the categories should be utilized to improve the title's embedding. We propose two directions of using categories, cooperating with conventional context-based approaches, to generate embeddings of Wikipedia titles. We conduct extensively large scale experiments on the generated title embeddings on Chinese Wikipedia. Experiments on word similarity task and analogical reasoning task show that our approaches significantly outperform conventional context-based approaches.","PeriodicalId":183586,"journal":{"name":"2017 International Conference on Asian Language Processing (IALP)","volume":"92 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133750004","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
A simple yet effective method for summarizing microblogging users with their representative tweets
2017 International Conference on Asian Language Processing (IALP) Pub Date : 2017-12-01 DOI: 10.1109/IALP.2017.8300605
Shuangyong Song, Yao Meng, Zhiwei Shi, Zhongguang Zheng, Haiqing Chen
{"title":"A simple yet effective method for summarizing microblogging users with their representative tweets","authors":"Shuangyong Song, Yao Meng, Zhiwei Shi, Zhongguang Zheng, Haiqing Chen","doi":"10.1109/IALP.2017.8300605","DOIUrl":"https://doi.org/10.1109/IALP.2017.8300605","url":null,"abstract":"Fast diffusion of information makes microblogging an efficient platform for seeking information, and the most convenient way for a user to get useful information is following others and in real time the updates of the followed users will be presented automatically. Deciding to follow or not to follow a user is usually based on this user's tweet content, because users usually try to find friends who post information relevant to their interests. Since reading through all tweets of a user is very time-consuming, we need to design a convenient user summarization form. In this paper, we design a model for summarizing users with their most representative tweets. The experimental results on Sina-weibo, one of the most popular microblogging sites in China, show our model can get a better performance than baseline methods.","PeriodicalId":183586,"journal":{"name":"2017 International Conference on Asian Language Processing (IALP)","volume":"188 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122869878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
The automatic extraction of common-used adverbs for teaching Chinese as second language
2017 International Conference on Asian Language Processing (IALP) Pub Date : 2017-12-01 DOI: 10.1109/IALP.2017.8300568
Zhimin Wang, Mei-Chu Wang
{"title":"The automatic extraction of common-used adverbs for teaching Chinese as second language","authors":"Zhimin Wang, Mei-Chu Wang","doi":"10.1109/IALP.2017.8300568","DOIUrl":"https://doi.org/10.1109/IALP.2017.8300568","url":null,"abstract":"This paper establishes the adverb statistical wordlist of Chinese teaching, by using the People's Daily and radio and television corpus within five years and designing the time span, statistical time point, stability, etc. The wordlist includes more than 40 statistical time points and describable change curve. Meanwhile, this paper improves the method which provides a valuable data for synonym discrimination and makes the performance much accord with people's experience. Finally, we analyze the top 26 common-used adverbs from the Graded Chinese Syllables according to the statistical wordlist. The result shows that 50% of the top 26 common-used adverbs in Graded Chinese Syllables are ranked between 50 and 174 position of the statistical wordlist. Thus there is much room for improving the common-used adverb ranking in the Graded Chinese Syllables.","PeriodicalId":183586,"journal":{"name":"2017 International Conference on Asian Language Processing (IALP)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125166075","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Investigating multi-task learning for automatic speech recognition with code-switching between Mandarin and English
2017 International Conference on Asian Language Processing (IALP) Pub Date : 2017-12-01 DOI: 10.1109/IALP.2017.8300538
Xiao Song, Yuexian Zou, Shilei Huang, Shaobin Chen, Yi Y. Liu
{"title":"Investigating multi-task learning for automatic speech recognition with code-switching between mandarin and english","authors":"Xiao Song, Yuexian Zou, Shilei Huang, Shaobin Chen, Yi Y. Liu","doi":"10.1109/IALP.2017.8300538","DOIUrl":"https://doi.org/10.1109/IALP.2017.8300538","url":null,"abstract":"This work investigates a Multi-task Learning (MTL-DNN) approach to enhance the performance of Mandarin-English code-switching conversational speech recognition (MECS-CSR). The approach aims at getting a better acoustic model for the primary task by jointly learning two auxiliary tasks together. To overcome the effect of co-articulation at code-switch points, under MTL-DNN, we propose to jointly train two types of Mandarin-English acoustic models according to the choice of acoustic units that describe the salient acoustic and phonetic information for Mandarin. To further make use of language information, we jointly train another acoustic model for language identification (LID) with the two acoustic models under the MTL-DNN. To evaluate the effectiveness of our developed MECS-CSR system, extensive experiments are carried out on a public dataset LDC2015S04. It is noted that our approach does not require other language resources. Compared with the first basic MECS-CSR system [1], Mixed Error Rate (MER) of our proposed approach is relatively reduced by 12.49%. The performance improvement benefits from multi-task learning where the common internal representation is obtained from the auxiliary tasks learning.","PeriodicalId":183586,"journal":{"name":"2017 International Conference on Asian Language Processing (IALP)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126852758","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
A light-weight method of building an LSTM-RNN-based bilingual TTS system
2017 International Conference on Asian Language Processing (IALP) Pub Date : 2017-12-01 DOI: 10.1109/IALP.2017.8300579
Huaiping Ming, Yanfeng Lu, Zhengchen Zhang, M. Dong
{"title":"A light-weight method of building an LSTM-RNN-based bilingual tts system","authors":"Huaiping Ming, Yanfeng Lu, Zhengchen Zhang, M. Dong","doi":"10.1109/IALP.2017.8300579","DOIUrl":"https://doi.org/10.1109/IALP.2017.8300579","url":null,"abstract":"For a long time, text-to-speech (TTS) synthesis systems could only handle one language. Early bilingual TTS systems were constructed by directly combining two monolingual systems, with language switching. The bilingual speech generated by such systems normally contained two different voices, therefore causing unnatural, sometimes disturbing effects. A genuine bilingual TTS system should use a single voice and avoid switching between two independent monolingual systems. Accordingly, the difficulties of building genuine bilingual speech synthesizers lie in merging two different languages into the same system and preparing bilingual speech data with the same speaker. Various methods have been proposed to overcome these difficulties, including soft prosody prediction, phone, state and frame mapping, and most recently speaker and language factorization. Professional speakers who can speak two languages fluently are hard to find. In many cases a speaker can speak one language well, but the second only fairly. In this paper we propose an easy linguistic feature concatenation method to build a bilingual TTS system with data created by such a speaker, using an LSTM-RNN-based speech synthesizer. Both objective and subjective evaluations show the effectiveness of this method.","PeriodicalId":183586,"journal":{"name":"2017 International Conference on Asian Language Processing (IALP)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121534992","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 20
Controlling byte pair encoding for neural machine translation
2017 International Conference on Asian Language Processing (IALP) Pub Date : 2017-12-01 DOI: 10.1109/IALP.2017.8300571
Alfred John Tacorda, Marvin John Ignacio, Nathaniel Oco, R. Roxas
{"title":"Controlling byte pair encoding for neural machine translation","authors":"Alfred John Tacorda, Marvin John Ignacio, Nathaniel Oco, R. Roxas","doi":"10.1109/IALP.2017.8300571","DOIUrl":"https://doi.org/10.1109/IALP.2017.8300571","url":null,"abstract":"Byte pair encoding(BPE) is an approach that segments the corpus in such a way that frequent sequence of characters are combined; it results to having word surface forms divided into its' root word and affix. It alone handles out-of-vocabulary words, but tends to not consistently segment inflected words. Controlled byte pair encoding (CBPE) allowed our word-level neural machine translation (NMT) model to easily recognize inflected words which are prevalent in morphologically-rich languages. It prevented BPE from merging affixes in a word to other characters in the word. Our resulting NMT models from CBPE consistently evaluates affixes that could've been segmented with variations in BPE. In our experiments, we considered 119,969 English-Filipino parallel language pairs from an existing dataset, with Filipino as a morphologically-rich language. The results show that BPE and CBPE both showed improvements in the BLEU scores from 38.31 to 44.82 and 44.07 for English→Filipino, and from 32.17 to 35.25 and 35.98 for Filipino→English, respectively. The lower scores in the Filipino→English can be attributed to other language characteristics of Filipino such as free word order, one-to-many relationship in translating from English to Filipino, and some transliterations in the parallel corpus. CBPE also performed slightly better for English→Filipino than for Filipino→English.","PeriodicalId":183586,"journal":{"name":"2017 International Conference on Asian Language Processing (IALP)","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129044774","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
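Illustrative sketch (not taken from the paper): the abstract above describes BPE as repeatedly combining frequent character sequences, and CBPE as preventing affixes from being merged with other characters in the word. One way this could look is shown below, assuming words arrive pre-segmented so that each affix is already a single symbol. The names `learn_bpe`, `merge_pair`, and the `protected` parameter, as well as the toy English corpus, are hypothetical and chosen only for illustration; the authors' actual procedure may differ.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace every adjacent occurrence of `pair` in `symbols` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(corpus, num_merges, protected=frozenset()):
    """Learn up to `num_merges` BPE merges from {tuple_of_symbols: frequency}.

    `protected` is a hypothetical set of symbols (e.g. affixes) that are never
    merged with a neighbour. This is an assumption made for this sketch.
    """
    vocab = Counter(corpus)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, skipping any pair that touches a protected symbol.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                if a in protected or b in protected:
                    continue
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Most frequent pair wins; ties are broken by the pair itself for determinism.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges, dict(vocab)


# Toy corpus: the suffix "ed" is pre-segmented as a single symbol.
toy_corpus = {("w", "a", "l", "k", "ed"): 5, ("t", "a", "l", "k", "ed"): 3}
plain_merges, plain_vocab = learn_bpe(toy_corpus, num_merges=10)
ctrl_merges, ctrl_vocab = learn_bpe(toy_corpus, num_merges=10, protected={"ed"})
print(sorted(plain_vocab))
print(sorted(ctrl_vocab))
```

In this toy setting the unconstrained run is free to absorb the suffix into the stem, while the constrained run keeps the suffix as a separate symbol in every word, which is the consistency property the abstract attributes to CBPE.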
Dynamic topic mining for microblog fused with user's behavior and time window
2017 International Conference on Asian Language Processing (IALP) Pub Date : 2017-12-01 DOI: 10.1109/IALP.2017.8300535
Fei Wu, Zhuo Wang, Zhengtao Yu, Liren Wang, Feng Zhou
{"title":"Dynamic topic mining for microblog fused with user's behavior and time window","authors":"Fei Wu, Zhuo Wang, Zhengtao Yu, Liren Wang, Feng Zhou","doi":"10.1109/IALP.2017.8300535","DOIUrl":"https://doi.org/10.1109/IALP.2017.8300535","url":null,"abstract":"Compared with traditional text, microblog text has features of user behavior and time window. Catered to features of microblog text, this paper proposed a method of dynamic topic mining for Microblog fused with user behavior and time window. Based on traditional LDA model, we use method of time window division to divide microblog text into each time window, then fuse features of user behavior in our model as guide information, sequentially construct dynamic topic mining for Microblog fused with user behavior and time window. Result of the experiment shows that the model we proposed has better effect on topic in microblog analyzing and topic intensity changing with time.","PeriodicalId":183586,"journal":{"name":"2017 International Conference on Asian Language Processing (IALP)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116608286","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
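Illustrative sketch (not taken from the paper): the abstract above says microblog text is first divided into time windows before topic modelling. A minimal version of that division step might look like the following. The function name `split_into_windows`, the one-day default window, and the sample posts are assumptions made for illustration; the LDA-based model and the user-behavior features described in the abstract are not shown.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def split_into_windows(posts, window=timedelta(days=1)):
    """Group (timestamp, text) pairs into consecutive fixed-length time windows.

    Returns {window_index: [texts]}, where index 0 starts at the earliest post.
    The fixed window length is an assumption of this sketch.
    """
    if not posts:
        return {}
    start = min(ts for ts, _ in posts)
    windows = defaultdict(list)
    for ts, text in posts:
        # Floor division of two timedeltas yields an integer window index.
        index = (ts - start) // window
        windows[index].append(text)
    return dict(sorted(windows.items()))


# Hypothetical sample posts.
sample_posts = [
    (datetime(2017, 12, 1, 8, 0), "post a"),
    (datetime(2017, 12, 1, 20, 0), "post b"),
    (datetime(2017, 12, 3, 9, 0), "post c"),
]
windows = split_into_windows(sample_posts)
print(windows)
```

Each window's texts could then be passed to a topic model separately, so that topic intensity can be compared from one window to the next.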
Neural machine translation for Sinhala and Tamil languages
2017 International Conference on Asian Language Processing (IALP) Pub Date : 2017-12-01 DOI: 10.1109/IALP.2017.8300576
Pasindu Tennage, Prabath Sandaruwan, Malith Thilakarathne, Achini Herath, Surangika Ranathunga, Sanath Jayasena, G. Dias
{"title":"Neural machine translation for sinhala and tamil languages","authors":"Pasindu Tennage, Prabath Sandaruwan, Malith Thilakarathne, Achini Herath, Surangika Ranathunga, Sanath Jayasena, G. Dias","doi":"10.1109/IALP.2017.8300576","DOIUrl":"https://doi.org/10.1109/IALP.2017.8300576","url":null,"abstract":"Neural Machine Translation (NMT) is becoming the current state of the art machine translation technique. Although NMT is successful for resourceful languages, its applicability in low-resource settings is still debatable. In this paper, we address the task of developing a NMT system for the most widely used language pair in Sri Lanka-Sinhala and Tamil, focusing on the domain of official government documents. We explore the ways of improving NMT using word phrases in a situation where the size of the parallel corpus is considerably small, and empirically show that the resulting models improve our benchmark domain specific Sinhala to Tamil and Tamil to Sinhala translation models by 0.68 and 5.4 BLEU, respectively. The paper also presents an analysis on how NMT performance varies with the amount of word phrases, in order to investigate the effects of word phrases in domain specific NMT.","PeriodicalId":183586,"journal":{"name":"2017 International Conference on Asian Language Processing (IALP)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116898243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 17
Improving air traffic control speech intelligibility by reducing speaking rate effectively
2017 International Conference on Asian Language Processing (IALP) Pub Date : 2017-12-01 DOI: 10.1109/IALP.2017.8300578
Nana Hou, Xiaohai Tian, Chng Eng Siong, B. Ma, Haizhou Li
{"title":"Improving air traffic control speech intelligibility by reducing speaking rate effectively","authors":"Nana Hou, Xiaohai Tian, Chng Eng Siong, B. Ma, Haizhou Li","doi":"10.1109/IALP.2017.8300578","DOIUrl":"https://doi.org/10.1109/IALP.2017.8300578","url":null,"abstract":"Low intelligibility of Air Traffic Control (ATC) speech is one major cause of aircraft accidents every year. Many factors can affect speech intelligibility, among which the most prominent aspects is the high speaking rate commonly present in ATC speech. Hence, a possible solution would be to improve intelligibility by artificially lengthening the spoken utterance to lower the speaking rate. In this work, we explore the lengthening of clean recorded ATC utterances by first identifying phoneme sequences in a given utterance. Such identified phoneme segments can then be lengthened. We will examine effects of lengthening vowels-only, consonants-only, or homogeneous lengthening. To verify our approach, we will conduct human listening test to evaluate the intelligibility. The results show 74.67% was obtained in AB preference test.","PeriodicalId":183586,"journal":{"name":"2017 International Conference on Asian Language Processing (IALP)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133422903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2