2013 International Conference on Asian Language Processing: Latest Papers

A Computer-Assist Algorithm to Detect Repetitive Stuttering Automatically
2013 International Conference on Asian Language Processing Pub Date : 2013-08-17 DOI: 10.1109/IALP.2013.32
Junbo Zhang, Bin Dong, Yonghong Yan
Abstract: This paper studies an algorithm for detecting repetitive stuttering in Chinese speech by computer. Building on previous research findings, it proposes improvements tailored to the features of repetitions in Chinese stuttered speech. First, a multi-span looping forced-alignment decoding network is designed to detect multi-syllable repetitions in Chinese stuttered speech. Second, a branch penalty factor is added to the network to adjust the decoding trend during recursive search, reducing the errors introduced by the complexity of the decoding network. Finally, the detected stutters are re-judged by computing a confidence score to improve the reliability of the detection result. Experimental results show that, compared with the previous algorithm, the proposed algorithm significantly improves system performance, with a relative reduction of about 18% in average detection error rate.
Citations: 12
Subjectivity Classification of Filipino Text with Features Based on Term Frequency -- Inverse Document Frequency
2013 International Conference on Asian Language Processing Pub Date : 2013-08-17 DOI: 10.1109/IALP.2013.40
Ralph Vincent J. Regalado, Jenina L. Chua, J. L. Co, Thomas James Z. Tiam-Lee
Abstract: Subjectivity classification determines whether a given document contains subjective information, or identifies which portions of the document are subjective. This research reports a machine learning approach to document-level and sentence-level subjectivity classification of Filipino texts using existing machine learning algorithms such as C4.5, Naïve Bayes, k-Nearest Neighbor, and Support Vector Machine. For document-level classification, Support Vector Machine gave the best result with 95.06% accuracy, while for sentence-level classification, Naïve Bayes gave the best result with 58.75% accuracy.
Citations: 5
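As a rough illustration of the kind of feature the title refers to, the sketch below computes plain TF-IDF weights for tokenized documents. The weighting variant (relative term frequency times log inverse document frequency), the function name `tfidf_vectors`, and the toy Filipino sentences are all assumptions made here; the abstract does not say which variant or data the authors used.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        vectors.append({
            # Relative term frequency times log inverse document frequency.
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors

# Hypothetical toy corpus (not from the paper), already tokenized.
docs = [
    ["maganda", "ang", "pelikula"],
    ["ang", "pelikula", "ay", "mahaba"],
    ["ang", "balita", "ngayon"],
]
vectors = tfidf_vectors(docs)
```

A term that occurs in every document, such as "ang" here, receives weight zero, which is why TF-IDF tends to emphasize words that distinguish one document from another. The resulting vectors could then be passed to any of the classifiers named in the abstract.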
Using Mutual Information Criterion to Design an Effective Lexicon for Chinese Pinyin-to-Character Conversion
2013 International Conference on Asian Language Processing Pub Date : 2013-08-17 DOI: 10.1109/IALP.2013.37
Wei Li, Jin-Song Zhang, Yanlu Xie, Xiaoyun Wang, M. Nishida, Seiichi Yamamoto
Abstract: Pinyin-to-character (P2C) conversion is mainly used to input Chinese characters into a computer. Its main problem is homophones, which are resolved by exploiting contextual information provided by a lexicon and an n-gram language model (LM). Our investigation of state-of-the-art P2C technologies reveals that conventional methods for optimizing them are almost all based on minimizing text perplexity, which is not directly related to P2C performance. We therefore propose a new optimization criterion, the mutual information (MI) between a text corpus and its Pinyin script, which we use to perform self-supervised word segmentation, build a lexicon, and estimate an n-gram LM, and then build a P2C system from them. We implemented the P2C system using a newspaper corpus. Compared with two baseline systems, one using a handcrafted lexicon and one using a perplexity-optimized lexicon, our system achieved relative error reductions of 19.7% and 10.3%, respectively, on the test corpus. The results show the effectiveness of our proposal.
Citations: 0
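For reference, the textbook definition of mutual information between two discrete random variables, and the usual meaning of a relative error reduction, can be written as below. Treating $W$ as the character text and $Y$ as its Pinyin script is an assumption made here for illustration; the abstract does not give the paper's exact formulation.

```latex
I(W;Y) = \sum_{w}\sum_{y} p(w,y)\,\log\frac{p(w,y)}{p(w)\,p(y)} = H(W) - H(W \mid Y),
\qquad
\text{relative error reduction} = \frac{E_{\text{baseline}} - E_{\text{proposed}}}{E_{\text{baseline}}}
```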
A Tentative Study on Language Model Based Solution to Multiple Choice of CET-4
2013 International Conference on Asian Language Processing Pub Date : 2013-08-17 DOI: 10.1109/IALP.2013.35
Zhihang Fan, Muyun Yang, T. Zhao, Sheng Li
Abstract: The paper presents a language-model-based solution to the Multiple Choice test items of CET-4. N-gram models of different orders, trained on web-scale English data, are examined with a dynamic programming search for the best answers. Experimental results indicate that both the 4-gram and 5-gram models achieve an average precision of 81% on 16 test items.
Citations: 0
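The general idea of answering a fill-in-the-blank item with a language model can be sketched as follows: fill each option into the stem and keep the one whose sentence the model scores highest. This is only a toy under stated assumptions: it uses a bigram model with add-one smoothing trained on three made-up sentences, whereas the paper reports 4-gram and 5-gram models trained on web-scale data with a dynamic programming search. The names `train_bigram`, `log_prob`, and `choose`, and the `___` blank marker, are hypothetical.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count context words and bigrams over whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent.split() + ["</s>"]
        unigrams.update(tokens[:-1])            # every token that acts as a context
        bigrams.update(zip(tokens, tokens[1:]))
    vocab = {w for s in sentences for w in s.split()} | {"</s>"}
    return unigrams, bigrams, len(vocab)

def log_prob(sentence, model):
    """Add-one smoothed bigram log-probability of a sentence (rough for unseen words)."""
    unigrams, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab_size))
        for a, b in zip(tokens, tokens[1:])
    )

def choose(stem, options, model):
    """Fill each option into the blank and return the highest-scoring one."""
    return max(options, key=lambda opt: log_prob(stem.replace("___", opt), model))

# Hypothetical training sentences and test item (not from the paper).
corpus = [
    "she is interested in music",
    "he is interested in history",
    "they are good at sports",
]
model = train_bigram(corpus)
answer = choose("she is interested ___ history", ["at", "in", "on"], model)
```

With this toy corpus the model prefers "in", because both "interested in" and "in history" were seen in training while the other options only receive smoothed counts.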
Mining Recipes in Microblog
2013 International Conference on Asian Language Processing Pub Date : 2013-08-17 DOI: 10.1109/IALP.2013.13
Shengyu Liu, Qingcai Chen, Shanshan Guan, Xiaolong Wang, Huimiao Shi
Abstract: Microblog, as an online communication platform, is becoming increasingly popular. Users generate large volumes of data every day, and this user-generated content contains a lot of useful knowledge, such as practical skills and technical expertise. This paper proposes a cross-data method to mine recipes in Microblog. First, snippets of text relevant to recipes are extracted from Baidu Encyclopedia. Second, the extracted snippets are used to train a domain-specific unigram language model. Third, candidate recipes in Microblog are mined based on the unigram language model. Finally, heuristic rules are used to identify real recipes among the candidates. Experimental results show the effectiveness of the proposed method.
Citations: 2
The Annotation Scheme for Uyghur Dependency Treebank
2013 International Conference on Asian Language Processing Pub Date : 2013-08-17 DOI: 10.1109/IALP.2013.56
Samat Mamitimin, Turgun Ibrahim, Marhaba Eli
Abstract: The paper introduces a dependency annotation effort that aims to fully annotate a Uyghur corpus. It is the first attempt of its kind to develop a large-scale treebank for Uyghur. We provide the motivation for adopting dependency theory as the basis of the annotation scheme and argue that dependency grammar is better suited to modeling the various linguistic phenomena in Uyghur. In our solution, syntactic relations are encoded as labeled dependency relations among segments of lexical items and sequences of inflectional groups separated by derivational boundaries. We present the basic annotation scheme, including morphological and syntactic dependency relations. We also show how the scheme handles phenomena such as omissions in copula sentences, punctuation, and coordination.
Citations: 5
Multi-thread Multi-keywords Matching Approach for Uyghur Text
2013 International Conference on Asian Language Processing Pub Date : 2013-08-17 DOI: 10.1109/IALP.2013.36
Xinyuan Zhao, Adili Abuliz
Abstract: Keyword matching is a preliminary step in public opinion analysis. Uyghur is an agglutinative language in which suffixes attach to words to express different semantic or syntactic functions. Traditional matching algorithms therefore cannot be applied directly to Uyghur text, because Uyghur words take different surface forms. In this paper, we implement an automaton-based multi-keyword matching algorithm for Uyghur text. The algorithm handles inflectional suffixes and vowel weakening within words by using a reverse suffix automaton and a vowel-weakening restoration automaton. By partitioning the keyword automata according to the first letter of each keyword, we also propose a general multi-threaded keyword matching approach for Uyghur.
Citations: 0
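To make "automaton-based multi-keyword matching" concrete, here is a minimal sketch of the classic Aho-Corasick automaton, which finds all occurrences of a keyword set in one pass over the text. This is a swapped-in, well-known technique chosen for illustration; the abstract does not name the automaton the authors built, and the Uyghur-specific suffix stripping, vowel-weakening restoration, and per-first-letter threading are not modeled here. The function names and the Latin-script toy keywords are hypothetical.

```python
from collections import deque

def build_automaton(keywords):
    """Build trie transitions, failure links, and per-state keyword outputs."""
    goto, fail, out = [{}], [0], [[]]
    for word in keywords:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(word)
    # Breadth-first pass: depth-1 states keep failure link 0 (the root).
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit keywords that end at the failure target (proper suffixes).
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out

def find_all(text, automaton):
    """Return (start_index, keyword) pairs for every match in text."""
    goto, fail, out = automaton
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            hits.append((i - len(word) + 1, word))
    return hits

# Hypothetical toy keywords and text (not from the paper).
automaton = build_automaton(["he", "she", "his", "hers"])
hits = find_all("ushers", automaton)
```

Because all keywords share one automaton, the scan cost grows with the text length plus the number of matches rather than with the number of keywords, which is the usual reason for preferring this family of algorithms over repeated single-keyword searches.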
Speech Recognition Research on Uyghur Accent Spoken Language
2013 International Conference on Asian Language Processing Pub Date : 2013-08-17 DOI: 10.1109/IALP.2013.52
Yating Yang, Bo Ma, Xinyu Tang, Osman Turghun
Abstract: This research focuses on Uyghur speech recognition for accented spoken language. When spoken language with pronunciation variation is recognized by a system built for standard spoken language, the recognition rate is not high enough. We propose a speech recognition framework for Uyghur accented spoken language, analyze its acoustic characteristics, describe the pronunciation variation phenomena of Uyghur, and create an acoustic model and a multi-pronunciation dictionary. Preliminary experimental results show that the proposed method improves the performance of Uyghur continuous speech recognition.
Citations: 0
Rule Refinement for Spoken Language Translation by Retrieving the Missing Translation of Content Words
2013 International Conference on Asian Language Processing Pub Date : 2013-08-17 DOI: 10.1109/IALP.2013.23
Linfeng Song, Jun Xie, Xing Wang, Yajuan Lü, Qun Liu
Abstract: Spoken language translation often suffers from missing translations of content words, and thus fails to generate an appropriate translation. In this paper, we propose a novel mutual-information-based method to improve spoken language translation by retrieving the missing translations of content words. For each rule, we exploit several features that indicate how well its inner content words are translated, so that MT systems can select better translation rules. Experimental results show that our method significantly improves translation performance, by 1.95 to 4.47 BLEU points on different test sets.
Citations: 0
Pronominal Resolution in Tamil Using Tree CRFs
2013 International Conference on Asian Language Processing Pub Date : 2013-08-17 DOI: 10.1109/IALP.2013.59
R. Ram, S. L. Devi
Abstract: We describe our work on pronominal resolution in Tamil using Tree CRFs. Pronominal resolution is the task of identifying the referent of a pronominal. In this work, we study the third-person pronouns in Tamil 'avan', 'aval', 'athu', and 'avar' (he, she, it, and they, respectively). Tamil is a Dravidian language that is morphologically rich and highly agglutinative. Tree CRFs are a machine learning method in which the data is modeled as a graph with edge weights used for learning. The features for learning are derived from the morphological features of the language. The work is carried out on tourism-domain data from the Web. We obtained 70.8% precision and 66.5% recall. The results are encouraging.
Citations: 11