NEWS@IJCNLP: Latest Publications

Whitepaper of NEWS 2009 Machine Transliteration Shared Task
NEWS@IJCNLP Pub Date : 2009-08-07 DOI: 10.3115/1699705.1699708
Haizhou Li, A. Kumaran, Min Zhang, V. Pervouchine
Abstract: Transliteration is defined as phonetic translation of names across languages. Transliteration of Named Entities (NEs) is necessary in many applications, such as machine translation, corpus alignment, cross-language IR, information extraction and automatic lexicon acquisition. All such systems call for high-performance transliteration, which is the focus of the shared task in the NEWS 2009 workshop. The objective of the shared task is to promote machine transliteration research by providing a common benchmarking platform for the community to evaluate the state-of-the-art technologies.
Citations: 42
DirecTL: a Language Independent Approach to Transliteration
NEWS@IJCNLP Pub Date : 2009-08-07 DOI: 10.3115/1699705.1699712
Sittichai Jiampojamarn, Aditya Bhargava, Qing Dou, Kenneth Dwyer, Grzegorz Kondrak
Abstract: We present DirecTL: an online discriminative sequence prediction model that employs a many-to-many alignment between target and source. Our system incorporates input segmentation, target character prediction, and sequence modeling in a unified dynamic programming framework. Experimental results suggest that DirecTL is able to independently discover many of the language-specific regularities in the training data.
Citations: 46
Combining a Two-step Conditional Random Field Model and a Joint Source Channel Model for Machine Transliteration
NEWS@IJCNLP Pub Date : 2009-08-07 DOI: 10.3115/1699705.1699724
D. Yang, Paul R. Dixon, Yi-Cheng Pan, T. Oonishi, Masanobu Nakamura, S. Furui
Abstract: This paper describes our system for "NEWS 2009 Machine Transliteration Shared Task" (NEWS 2009). We only participated in the standard run, which is a direct orthographical mapping (DOP) between two languages without using any intermediate phonemic mapping. We propose a new two-step conditional random field (CRF) model for DOP machine transliteration, in which the first CRF segments a source word into chunks and the second CRF maps the chunks to a word in the target language. The two-step CRF model obtains a slightly lower top-1 accuracy when compared to a state-of-the-art n-gram joint source-channel model. The combination of the CRF model with the joint source-channel leads to improvements in all the tasks. The official result of our system in the NEWS 2009 shared task confirms the effectiveness of our system; where we achieved 0.627 top-1 accuracy for Japanese transliterated to Japanese Kanji(JJ), 0.713 for English-to-Chinese(E2C) and 0.510 for English-to-Japanese Katakana(E2J).
Citations: 19
Substring-based Transliteration with Conditional Random Fields
NEWS@IJCNLP Pub Date : 2009-08-07 DOI: 10.3115/1699705.1699729
S. Reddy, Sonjia Waxmonsky
Abstract: Motivated by phrase-based translation research, we present a transliteration system where characters are grouped into substrings to be mapped atomically into the target language. We show how this substring representation can be incorporated into a Conditional Random Field model that uses local context and phonemic information.
Citations: 17
Machine Transliteration using Target-Language Grapheme and Phoneme: Multi-engine Transliteration Approach
NEWS@IJCNLP Pub Date : 2009-08-07 DOI: 10.3115/1699705.1699714
Jong-Hoon Oh, Kiyotaka Uchimoto, Kentaro Torisawa
Abstract: This paper describes our approach to "NEWS 2009 Machine Transliteration Shared Task." We built multiple transliteration engines based on different combinations of two transliteration models and three machine learning algorithms. Then, the outputs from these transliteration engines were combined using re-ranking functions. Our method was applied to all language pairs in "NEWS 2009 Machine Transliteration Shared Task." The official results of our standard runs were ranked the best for four language pairs and the second best for three language pairs.
Citations: 20
Fast Decoding and Easy Implementation: Transliteration as Sequential Labeling
NEWS@IJCNLP Pub Date : 2009-08-07 DOI: 10.3115/1699705.1699722
E. Aramaki, Takeshi Abekawa
Abstract: Although most of previous transliteration methods are based on a generative model, this paper presents a discriminative transliteration model using conditional random fields. We regard character(s) as a kind of label, which enables us to consider a transliteration process as a sequential labeling process. This approach has two advantages: (1) fast decoding and (2) easy implementation. Experimental results yielded competitive performance, demonstrating the feasibility of the proposed approach.
Citations: 11
A Syllable-based Name Transliteration System
NEWS@IJCNLP Pub Date : 2009-08-07 DOI: 10.3115/1699705.1699730
Xue Jiang, Le Sun, Dakun Zhang
Abstract: This paper describes the name entity transliteration system which we conducted for the "NEWS2009 Machine Transliteration Shared Task" (Li et al 2009). We get the transliteration in Chinese from an English name with three steps. We syllabify the English name into a sequence of syllables by some rules, and generate the most probable Pinyin sequence with the mapping model of English syllables to Pinyin (EP model), then we convert the Pinyin sequence into a Chinese character sequence with the mapping model of Pinyin to characters (PC model). And we get the final Chinese character sequence. Our system achieves an ACC of 0.498 and a Mean F-score of 0.786 in the official evaluation result.
Citations: 12
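The ACC and Mean F-score figures quoted in the abstract above are evaluation measures of the shared task. The following is only a rough sketch of how such scores could be computed, assuming ACC is top-1 exact-match accuracy and the F-score is derived from the longest common subsequence (LCS) between a candidate and a reference. The function names, and the choice of taking the best F-score over several accepted references, are illustrative assumptions and are not taken from this listing.

```python
from typing import List, Sequence, Tuple


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def f_score(candidate: str, reference: str) -> float:
    """LCS-based F-score between one candidate and one reference."""
    if not candidate or not reference:
        return 0.0
    lcs = lcs_length(candidate, reference)
    if lcs == 0:
        return 0.0
    precision = lcs / len(candidate)
    recall = lcs / len(reference)
    return 2 * precision * recall / (precision + recall)


def evaluate(predictions: Sequence[str],
             references: Sequence[List[str]]) -> Tuple[float, float]:
    """Return (ACC, mean F-score) for top-1 predictions.

    references[i] holds every accepted transliteration of item i.
    Assumption: the F-score of an item is its best score over those references.
    """
    n = len(predictions)
    correct = sum(1 for p, refs in zip(predictions, references) if p in refs)
    total_f = sum(max(f_score(p, r) for r in refs)
                  for p, refs in zip(predictions, references))
    return correct / n, total_f / n


if __name__ == "__main__":
    # Toy strings only; these are not shared-task data.
    acc, mean_f = evaluate(["ab", "xy"], [["ab"], ["xz"]])
    print(f"ACC={acc:.3f}  Mean F-score={mean_f:.3f}")
```

Under these assumptions a system can score a high Mean F-score with a much lower ACC, because near-miss outputs earn partial credit on the former and none on the latter.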
Combining MDL Transliteration Training with Discriminative Modeling
NEWS@IJCNLP Pub Date : 2009-08-07 DOI: 10.3115/1699705.1699735
D. Zelenko
Abstract: We present a transliteration system that introduces minimum description length training for transliteration and combines it with discriminative modeling. We apply the proposed approach to transliteration from English to 8 non-Latin scripts, with promising results.
Citations: 7
Improving Transliteration Accuracy Using Word-Origin Detection and Lexicon Lookup
NEWS@IJCNLP Pub Date : 2009-08-07 DOI: 10.3115/1699705.1699727
Mitesh M. Khapra, P. Bhattacharyya
Abstract: We propose a framework for transliteration which uses (i) a word-origin detection engine (pre-processing) (ii) a CRF based transliteration engine and (iii) a re-ranking model based on lexicon-lookup (post-processing). The results obtained for English-Hindi and English-Kannada transliteration show that the preprocessing and post-processing modules improve the top-1 accuracy by 7.1%.
Citations: 12
Experiences with English-Hindi, English-Tamil and English-Kannada Transliteration Tasks at NEWS 2009
NEWS@IJCNLP Pub Date : 2009-08-07 DOI: 10.3115/1699705.1699716
Manoj Kumar Chinnakotla, O. Damani
Abstract: We use a Phrase-Based Statistical Machine Translation approach to Transliteration where the words are replaced by characters and sentences by words. We employ the standard SMT tools like GIZA++ for learning alignments and Moses for learning the phrase tables and decoding. Besides tuning the standard SMT parameters, we focus on tuning the Character Sequence Model (CSM) related parameters like order of the CSM, weight assigned to CSM during decoding and corpus used for CSM estimation. Our results show that paying sufficient attention to CSM pays off in terms of increased transliteration accuracies.
Citations: 20
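The abstract above treats characters as "words" and names as "sentences" so that standard phrase-based SMT tools can be reused. A minimal sketch of that data preparation step follows, assuming one name pair per line and an underscore as a stand-in for spaces inside multi-word names. The helper names and the underscore convention are hypothetical and are not taken from the paper.

```python
from typing import Iterable, List, Tuple


def to_char_tokens(name: str, space_marker: str = "_") -> str:
    """Turn a name into a space-separated character sequence.

    Internal spaces become space_marker so that word boundaries inside the
    name survive whitespace tokenisation.
    """
    return " ".join(space_marker if ch == " " else ch for ch in name.strip())


def from_char_tokens(tokens: str, space_marker: str = "_") -> str:
    """Inverse of to_char_tokens: join characters back into a name."""
    return "".join(" " if t == space_marker else t for t in tokens.split())


def prepare_parallel(pairs: Iterable[Tuple[str, str]]) -> Tuple[List[str], List[str]]:
    """Build line-aligned source and target 'sentences' from name pairs."""
    source_lines, target_lines = [], []
    for source_name, target_name in pairs:
        source_lines.append(to_char_tokens(source_name))
        target_lines.append(to_char_tokens(target_name))
    return source_lines, target_lines


if __name__ == "__main__":
    # Toy Latin-script pair only; real training pairs would come from the task data.
    src, tgt = prepare_parallel([("new york", "nyu yok")])
    print(src[0])
    print(tgt[0])
```

The two resulting line-aligned lists could then be written to files and passed to an alignment and phrase-extraction toolchain in the same way as an ordinary parallel corpus. A character-level language model over the target side would play the role the abstract calls the Character Sequence Model.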