NEWS@IJCNLP: Latest Papers

Report of NEWS 2016 Machine Transliteration Shared Task
NEWS@IJCNLP Pub Date : 2011-11-01 DOI: 10.18653/v1/W16-2709
Xiangyu Duan, Rafael E. Banchs, Min Zhang, Haizhou Li, A. Kumaran
Abstract: This report documents the Machine Transliteration Shared Task conducted as a part of the Named Entities Workshop (NEWS 2011), an IJCNLP 2011 workshop. The shared task features machine transliteration of proper names from English to 11 languages and from 3 languages to English. In total, 14 tasks are provided. 10 teams from 7 different countries participated in the evaluations. Finally, 73 standard and 4 non-standard runs are submitted, where diverse transliteration methodologies are explored and reported on the evaluation data. We report the results with 4 performance metrics. We believe that the shared task has successfully achieved its objective by providing a common benchmarking platform for the research community to evaluate the state-of-the-art technologies that benefit the future research and development.
Citations: 7
Transliteration System Using Pair HMM with Weighted FSTs
NEWS@IJCNLP Pub Date : 2009-08-07 DOI: 10.3115/1699705.1699731
Peter Nabende
Abstract: This paper presents a transliteration system based on pair Hidden Markov Model (pair HMM) training and Weighted Finite State Transducer (WFST) techniques. Parameters used by WFSTs for transliteration generation are learned from a pair HMM. Parameters from pair-HMM training on English-Russian data sets are found to give better transliteration quality than parameters trained for WFSTs for corresponding structures. Training a pair HMM on English vowel bigrams and standard bigrams for Cyrillic Romanization, and using a few transformation rules on generated Russian transliterations to test for context improves the system's transliteration quality.
Citations: 14
Bridging Languages by SuperSense Entity Tagging
NEWS@IJCNLP Pub Date : 2009-08-07 DOI: 10.3115/1699705.1699740
Davide Picca, A. Gliozzo, S. Campora
Abstract: This paper explores a very basic linguistic phenomenon in multilingualism: the lexicalizations of entities are very often identical within different languages while concepts are usually lexicalized differently. Since entities are commonly referred to by proper names in natural language, we measured their distribution in the lexical overlap of the terminologies extracted from comparable corpora. Results show that the lexical overlap is mostly composed by unambiguous words, which can be regarded as anchors to bridge languages: most of terms having the same spelling refer exactly to the same entities. Thanks to this important feature of Named Entities, we developed a multilingual super sense tagging system capable to distinguish between concepts and individuals. Individuals adopted for training have been extracted both by YAGO and by a heuristic procedure. The general F1 of the English tagger is over 76%, which is in line with the state of the art on super sense tagging while augmenting the number of classes. Performances for Italian are slightly lower, while ensuring a reasonable accuracy level which is capable to show effective results for knowledge acquisition.
Citations: 3
Voted NER System using Appropriate Unlabeled Data
NEWS@IJCNLP Pub Date : 2009-08-07 DOI: 10.3115/1699705.1699749
Asif Ekbal, Sivaji Bandyopadhyay
Abstract: This paper reports a voted Named Entity Recognition (NER) system with the use of appropriate unlabeled data. The proposed method is based on the classifiers such as Maximum Entropy (ME), Conditional Random Field (CRF) and Support Vector Machine (SVM) and has been tested for Bengali. The system makes use of the language independent features in the form of different contextual and orthographic word level features along with the language dependent features extracted from the Part of Speech (POS) tagger and gazetteers. Context patterns generated from the unlabeled data using an active learning method have been used as the features in each of the classifiers. A semi-supervised method has been used to describe the measures to automatically select effective documents and sentences from unlabeled data. Finally, the models have been combined together into a final system by weighted voting technique. Experimental results show the effectiveness of the proposed approach with the overall Recall, Precision, and F-Score values of 93.81%, 92.18% and 92.98%, respectively. We have shown how the language dependent features can improve the system performance.
Citations: 42
Chinese-English Organization Name Translation Based on Correlative Expansion
NEWS@IJCNLP Pub Date : 2009-08-07 DOI: 10.3115/1699705.1699741
Feiliang Ren, Muhua Zhu, Huizhen Wang, Jingbo Zhu
Abstract: This paper presents an approach to translating Chinese organization names into English based on correlative expansion. Firstly, some candidate translations are generated by using statistical translation method. And several correlative named entities for the input are retrieved from a correlative named entity list. Secondly, three kinds of expansion methods are used to generate some expanded queries. Finally, these queries are submitted to a search engine, and the refined translation results are mined and re-ranked by using the returned web pages. Experimental results show that this approach outperforms the compared system in overall translation accuracy.
Citations: 7
Modeling Machine Transliteration as a Phrase Based Statistical Machine Translation Problem
NEWS@IJCNLP Pub Date : 2009-08-07 DOI: 10.3115/1699705.1699737
Taraka Rama, Karthik Gali
Abstract: In this paper we use the popular phrase-based SMT techniques for the task of machine transliteration, for English-Hindi language pair. Minimum error rate training has been used to learn the model weights. We have achieved an accuracy of 46.3% on the test set. Our results show these techniques can be successfully used for the task of machine transliteration.
Citations: 47
Transliteration of Name Entity via Improved Statistical Translation on Character Sequences
NEWS@IJCNLP Pub Date : 2009-08-07 DOI: 10.3115/1699705.1699720
Yan Song, C. Kit, Xiao Chen
Abstract: Transliteration of given parallel name entities can be formulated as a phrase-based statistical machine translation (SMT) process, via its routine procedure comprising training, optimization and decoding. In this paper, we present our approach to transliterating name entities using the loglinear phrase-based SMT on character sequences. Our proposed work improves the translation by using bidirectional models, plus some heuristic guidance integrated in the decoding process. Our evaluated results indicate that this approach performs well in all standard runs in the NEWS2009 Machine Transliteration Shared Task.
Citations: 18
epsilon-extension Hidden Markov Models and Weighted Transducers for Machine Transliteration
NEWS@IJCNLP Pub Date : 2009-08-07 DOI: 10.3115/1699705.1699736
Balakrishnan Varadarajan, D. Rao
Abstract: We describe in detail a method for transliterating an English string to a foreign language string evaluated on five different languages, including Tamil, Hindi, Russian, Chinese, and Kannada. Our method involves deriving substring alignments from the training data and learning a weighted finite state transducer from these alignments. We define an epsilon-extension Hidden Markov Model to derive alignments between training pairs and a heuristic to extract the substring alignments. Our method involves only two tunable parameters that can be optimized on held-out data.
Citations: 3
A Language-Independent Transliteration Schema Using Character Aligned Models at NEWS 2009
NEWS@IJCNLP Pub Date : 2009-08-07 DOI: 10.3115/1699705.1699715
Praneeth Shishtla, S. Veeravalli, Sethuramalingam Subramaniam, Vasudeva Varma
Abstract: In this paper we present a statistical transliteration technique that is language independent. This technique uses statistical alignment models and Conditional Random Fields (CRF). Statistical alignment models maximizes the probability of the observed (source, target) word pairs using the expectation maximization algorithm and then the character level alignments are set to maximum posterior predictions of the model. CRF has efficient training and decoding processes which is conditioned on both source and target languages and produces globally optimal solution.
Citations: 37
English to Hindi Machine Transliteration System at NEWS 2009
NEWS@IJCNLP Pub Date : 2009-08-07 DOI: 10.3115/1699705.1699726
Amitava Das, Asif Ekbal, Tapabrata Mondal, Sivaji Bandyopadhyay
Abstract: This paper reports about our work in the NEWS 2009 Machine Transliteration Shared Task held as part of ACL-IJCNLP 2009. We submitted one standard run and two non-standard runs for English to Hindi transliteration. The modified joint source-channel model has been used along with a number of alternatives. The system has been trained on the NEWS 2009 Machine Transliteration Shared Task datasets. For standard run, the system demonstrated an accuracy of 0.471 and the mean F-Score of 0.861. The non-standard runs yielded the accuracy and mean F-scores of 0.389 and 0.831 respectively in the first one and 0.384 and 0.828 respectively in the second one. The non-standard runs resulted in substantially worse performance than the standard run. The reasons for this are the ranking algorithm used for the output and the types of tokens present in the test set.
Citations: 28