2011 International Conference on Asian Language Processing: Latest Papers

Optimal Translation Boundaries for BTG-Based Decoding
2011 International Conference on Asian Language Processing Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.73
Xiangyu Duan, Min Zhang
Abstract: This paper proposes a method for inducing translation boundaries as soft constraints for Bracketing Transduction Grammar based (BTG-based) decoding. Translation boundaries used in previous research are extracted from left-most synchronous trees generated by a deterministic algorithm. Translation boundaries in this research are extracted from induced synchronous trees, which are statistically optimal and more balanced than the left-most synchronous trees. Experiments show that induced translation boundaries are more consistent than those extracted from left-most synchronous trees, resulting in significantly better performance than the strong baseline.
Citations: 0
Character-Level System Combination: An Empirical Study for English-to-Chinese Spoken Language Translation
2011 International Conference on Asian Language Processing Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.47
Jinhua Du
Abstract: This paper proposes a character-level system combination strategy for English-Chinese spoken language translation. For languages like Chinese, in which word boundaries are not orthographically marked, word segmentation, which segments a Chinese sentence into a sequence of words, is often required for many Natural Language Processing tasks. In this paper we evaluate the impact of segmentation (spoken data) on the performance of system combination, and show that using inappropriate segmentation in system combination can result in inferior performance compared to single systems. We further demonstrate that using characters as the basic translation unit in system combination on the IWSLT ASR translation task leads to significant gains in translation quality in terms of BLEU and NIST scores.
Citations: 1
A Simplified-Traditional Chinese Character Conversion Model Based on Log-Linear Models
2011 International Conference on Asian Language Processing Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.15
Yidong Chen, X. Shi, Changle Zhou
Abstract: With the growth of exchange activities among the four cross-strait regions, the problem of correctly converting between Traditional Chinese (TC) and Simplified Chinese (SC) has become increasingly important. Numerous one-to-many mappings and term usage differences make it more difficult to convert from SC to TC. This paper proposes a novel simplified-traditional Chinese character conversion model based on log-linear models, in which features such as language models and lexical semantic consistency weights are integrated. When estimating lexical semantic consistency weights, cross-language word-based semantic spaces were used. Experiments were conducted, and the results show that the proposed model achieves better performance.
Citations: 3
Automatic Acquisition of Chinese-Tibetan Multi-word Equivalent Pair from Bilingual Corpora
2011 International Conference on Asian Language Processing Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.33
Minghua Nuo, Huidan Liu, Long-Long Ma, Jian Wu, Zhiming Ding
Abstract: This paper aims to construct a Chinese-Tibetan multi-word equivalent pair dictionary for a Chinese-Tibetan computer-aided translation system. Since Tibetan is a morphologically rich language, we propose a two-phase framework to automatically extract multi-word equivalent pairs. The first phase extracts Chinese Multi-word Units (MWUs): we propose the CBEM model to partition a Chinese sentence into MWUs using two measures of collocation and binding degree. The second phase obtains Tibetan translations of the extracted Chinese MWUs: we propose the TSIM model, which focuses on extracting 1-to-n bilingual MWUs. Preliminary experimental results show that the mixed method combining the CBEM model with the TSIM model is effective.
Citations: 2
The Chinese-English Bilingual Sentence Alignment Based on Length
2011 International Conference on Asian Language Processing Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.70
Huafu Ding, Li Quan, Haoliang Qi
Abstract: Bilingual sentence pairs are a key resource for statistical machine translation. Currently, most sentence-aligned corpora are between English and French or English and German, and there are few specialized sentence alignment datasets between English and Chinese. Our aim is therefore to create large-scale, high-precision English-Chinese aligned sentences. A length-based method is used to align bilingual paragraphs extracted from CNKI (China National Knowledge Infrastructure). CNKI is one of the largest academic websites and contains a huge number of Chinese-English bilingual paragraphs. Our method adapts and combines several approaches, both word-based and hybrid. Finally, we choose the best alignment by dynamic programming. Experiments on the CNKI dataset showed that the presented method achieved satisfactory recall and precision.
Citations: 4
Natural Language Grammar Induction of Indonesian Language Corpora Using Genetic Algorithm
2011 International Conference on Asian Language Processing Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.58
Ary Hermawan, Gunawan, Joan Santoso
Abstract: Grammar induction is a machine learning process for learning grammar from corpora. This paper discusses the process of grammar induction for Indonesian language corpora using a genetic algorithm. The grammar production rules are modeled in the form of chromosomes. The fitness function counts how many sentences can be parsed. The data used are Indonesian fairy tales such as "Bawang Merah Bawang Putih" and "Malin Kundang". This paper gives detailed explanations of the steps of each process carried out for natural language grammar problems.
Citations: 5
Automatic Labeling and Phonetic Assessment for an Unknown Asian Language: The Case of the "Mo Piu" North Vietnamese Minority (early results)
2011 International Conference on Asian Language Processing Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.81
G. Caelen-Haumont, Sam Sethserey, E. Castelli
Abstract: This paper presents an expert phonetician's assessment of the automatic labeling of an undocumented, unknown, under-resourced and unwritten language (Mo Piu) of North Vietnam. For this task, we chose 5 languages in different combinations in order to highlight the best set. Two assessments will be presented: first, that of the phonetic events, and second, that of the language sets. After the presentation of the methods used for the automatic labeling and recognition, the paper will focus on the assessment of the phonetic units and of the language sets.
Citations: 5
Theoretical Framework of Mongolian Word Segmentation Specification for Information Processing
2011 International Conference on Asian Language Processing Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.45
T. Laga, Xiaobing Zhao
Abstract: The establishment of a Contemporary Mongolian word segmentation specification for information processing is of great significance for the standardization of information processing, the compatibility of different systems, the sharing of corpora, grammatical analysis, and POS tagging. The present paper studies the framework of Mongolian word segmentation, including guidelines, formulating principles, styles, scopes of segmentation units, establishment foundation, structure of the specification and so on, and lays the theoretical foundation for this specification.
Citations: 0
Developing Bengali Speech Corpus for Phone Recognizer Using Optimum Text Selection Technique
2011 International Conference on Asian Language Processing Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.16
S. Mandal, B. Das, Pabitra Mitra, A. Basu
Abstract: A speech corpus plays a key role in the construction of automatic speech recognition (ASR), text-to-speech (TTS) synthesis and phone recognition (PR) systems. PR and ASR systems are quite similar in functionality. The difference between the two is that a PR system converts the speech signal to phone text (a phone being the smallest discrete segment of sound in uttered speech), whereas an ASR system converts the speech signal to word text. A speech corpus for a PR system usually consists of a text corpus, recording data corresponding to the text corpus, a phonetic representation of the text corpus and a pronunciation dictionary. Selecting optimum text with a balanced phone distribution from the available text is an important task in developing a high-quality PR system. In this paper, we describe our text selection technique and discuss the performance of the phone recognition system.
Citations: 27
Improving Bilingual Lexicon Construction from Chinese-English Comparable Corpora via Dependency Relationship Mapping
2011 International Conference on Asian Language Processing Pub Date : 2011-11-15 DOI: 10.1109/IALP.2011.22
Hua Xu, Dandan Liu, Longhua Qian, Guodong Zhou
Abstract: Currently, the context-based approach is a popular way of constructing bilingual lexicons from comparable corpora. Following this line of research, this paper proposes a dependency relationship mapping model and investigates its effect on bilingual lexicon construction. The experiments show that, by mapping context words, dependency relationship types and directions simultaneously when calculating the similarity between two words in the source and target languages respectively, our approach significantly outperforms a state-of-the-art system in bilingual lexicon construction in both the Chinese-English and English-Chinese directions. This demonstrates the effectiveness of our dependency relationship mapping model for bilingual lexicon construction.
Citations: 1