NEWS@ACL Pub Date: 2018-07-20 DOI: 10.18653/v1/W18-2407
Zhongwei Li, Xuancong Wang, AiTi Aw, Chng Eng Siong, Haizhou Li
{"title":"Named-Entity Tagging and Domain adaptation for Better Customized Translation","authors":"Zhongwei Li, Xuancong Wang, AiTi Aw, Chng Eng Siong, Haizhou Li","doi":"10.18653/v1/W18-2407","DOIUrl":"https://doi.org/10.18653/v1/W18-2407","url":null,"abstract":"Customized translation need pay special attention to the target domain terminology especially the named-entities for the domain. Adding linguistic features to neural machine translation (NMT) has been shown to benefit translation in many studies. In this paper, we further demonstrate that adding named-entity (NE) feature with named-entity recognition (NER) into the source language produces better translation with NMT. Our experiments show that by just including the different NE classes and boundary tags, we can increase the BLEU score by around 1 to 2 points using the standard test sets from WMT2017. We also show that adding NE tags using NER and applying in-domain adaptation can be combined to further improve customized machine translation.","PeriodicalId":189654,"journal":{"name":"NEWS@ACL","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131992497","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
NEWS@ACL Pub Date: 2018-07-01 DOI: 10.18653/v1/W18-2408
Nancy F. Chen, Xiangyu Duan, Min Zhang, Rafael E. Banchs, Haizhou Li
{"title":"NEWS 2018 Whitepaper","authors":"Nancy F. Chen, Xiangyu Duan, Min Zhang, Rafael E. Banchs, Haizhou Li","doi":"10.18653/v1/W18-2408","DOIUrl":"https://doi.org/10.18653/v1/W18-2408","url":null,"abstract":"Transliteration is defined as phonetic translation of names across languages. Transliteration of Named Entities (NEs) is necessary in many applications, such as machine translation, corpus alignment, cross-language IR, information extraction and automatic lexicon acquisition. All such systems call for high-performance transliteration, which is the focus of shared task in the NEWS 2018 workshop. The objective of the shared task is to promote machine transliteration research by providing a common benchmarking platform for the community to evaluate the state-of-the-art technologies.","PeriodicalId":189654,"journal":{"name":"NEWS@ACL","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115203903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
NEWS@ACL Pub Date: 2018-07-01 DOI: 10.18653/v1/W18-2406
Christian Hardmeier, L. Bevacqua, S. Loáiciga, H. Rohde
{"title":"Forms of Anaphoric Reference to Organisational Named Entities: Hoping to widen appeal, they diversified","authors":"Christian Hardmeier, L. Bevacqua, S. Loáiciga, H. Rohde","doi":"10.18653/v1/W18-2406","DOIUrl":"https://doi.org/10.18653/v1/W18-2406","url":null,"abstract":"Proper names of organisations are a special case of collective nouns. Their meaning can be conceptualised as a collective unit or as a plurality of persons, allowing for different morphological marking of coreferent anaphoric pronouns. This paper explores the variability of references to organisation names with 1) a corpus analysis and 2) two crowd-sourced story continuation experiments. The first shows that the preference for singular vs. plural conceptualisation is dependent on the level of formality of a text. In the second, we observe a strong preference for the plural they otherwise typical of informal speech. Using edited corpus data instead of constructed sentences as stimuli reduces this preference.","PeriodicalId":189654,"journal":{"name":"NEWS@ACL","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124571826","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
NEWS@ACL Pub Date: 2018-07-01 DOI: 10.18653/v1/W18-2411
Soumyadeep Kundu, Sayantan Paul, Santanu Pal
{"title":"A Deep Learning Based Approach to Transliteration","authors":"Soumyadeep Kundu, Sayantan Paul, Santanu Pal","doi":"10.18653/v1/W18-2411","DOIUrl":"https://doi.org/10.18653/v1/W18-2411","url":null,"abstract":"In this paper, we propose different architectures for language independent machine transliteration which is extremely important for natural language processing (NLP) applications. Though a number of statistical models for transliteration have already been proposed in the past few decades, we proposed some neural network based deep learning architectures for the transliteration of named entities. Our transliteration systems adapt two different neural machine translation (NMT) frameworks: recurrent neural network and convolutional sequence to sequence based NMT. It is shown that our method provides quite satisfactory results when it comes to multi lingual machine transliteration. Our submitted runs are an ensemble of different transliteration systems for all the language pairs. In the NEWS 2018 Shared Task on Transliteration, our method achieves top performance for the En–Pe and Pe–En language pairs and comparable results for other cases.","PeriodicalId":189654,"journal":{"name":"NEWS@ACL","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122284692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
NEWS@ACL Pub Date: 2018-07-01 DOI: 10.18653/v1/W18-2403
E. Inan, Oğuz Dikenelli
{"title":"A Sequence Learning Method for Domain-Specific Entity Linking","authors":"E. Inan, Oğuz Dikenelli","doi":"10.18653/v1/W18-2403","DOIUrl":"https://doi.org/10.18653/v1/W18-2403","url":null,"abstract":"Recent collective Entity Linking studies usually promote global coherence of all the mapped entities in the same document by using semantic embeddings and graph-based approaches. Although graph-based approaches are shown to achieve remarkable results, they are computationally expensive for general datasets. Also, semantic embeddings only indicate relatedness between entity pairs without considering sequences. In this paper, we address these problems by introducing a two-fold neural model. First, we match easy mention-entity pairs and using the domain information of this pair to filter candidate entities of closer mentions. Second, we resolve more ambiguous pairs using bidirectional Long Short-Term Memory and CRF models for the entity disambiguation. Our proposed system outperforms state-of-the-art systems on the generated domain-specific evaluation dataset.","PeriodicalId":189654,"journal":{"name":"NEWS@ACL","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123236810","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
NEWS@ACL Pub Date: 2018-07-01 DOI: 10.18653/v1/W18-2412
Saeed Najafi, B. Hauer, Rashed Rubby Riyadh, Leyuan Yu, Grzegorz Kondrak
{"title":"Comparison of Assorted Models for Transliteration","authors":"Saeed Najafi, B. Hauer, Rashed Rubby Riyadh, Leyuan Yu, Grzegorz Kondrak","doi":"10.18653/v1/W18-2412","DOIUrl":"https://doi.org/10.18653/v1/W18-2412","url":null,"abstract":"We report the results of our experiments in the context of the NEWS 2018 Shared Task on Transliteration. We focus on the comparison of several diverse systems, including three neural MT models. A combination of discriminative, generative, and neural models obtains the best results on the development sets. We also put forward ideas for improving the shared task.","PeriodicalId":189654,"journal":{"name":"NEWS@ACL","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127295857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
NEWS@ACL Pub Date: 2018-07-01 DOI: 10.18653/v1/W18-2409
Nancy F. Chen, Rafael E. Banchs, Min Zhang, Xiangyu Duan, Haizhou Li
{"title":"Report of NEWS 2018 Named Entity Transliteration Shared Task","authors":"Nancy F. Chen, Rafael E. Banchs, Min Zhang, Xiangyu Duan, Haizhou Li","doi":"10.18653/v1/W18-2409","DOIUrl":"https://doi.org/10.18653/v1/W18-2409","url":null,"abstract":"This report presents the results from the Named Entity Transliteration Shared Task conducted as part of The Seventh Named Entities Workshop (NEWS 2018) held at ACL 2018 in Melbourne, Australia. Similar to previous editions of NEWS, the Shared Task featured 19 tasks on proper name transliteration, including 13 different languages and two different Japanese scripts. A total of 6 teams from 8 different institutions participated in the evaluation, submitting 424 runs, involving different transliteration methodologies. Four performance metrics were used to report the evaluation results. The NEWS shared task on machine transliteration has successfully achieved its objectives by providing a common ground for the research community to conduct comparative evaluations of state-of-the-art technologies that will benefit the future research and development in this area.","PeriodicalId":189654,"journal":{"name":"NEWS@ACL","volume":"65 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130473907","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
NEWS@ACL Pub Date: 2018-07-01 DOI: 10.18653/v1/W18-2404
Jiewen Wu, Rafael E. Banchs, L. F. D’Haro, Pavitra Krishnaswamy, Nancy F. Chen
{"title":"Attention-based Semantic Priming for Slot-filling","authors":"Jiewen Wu, Rafael E. Banchs, L. F. D’Haro, Pavitra Krishnaswamy, Nancy F. Chen","doi":"10.18653/v1/W18-2404","DOIUrl":"https://doi.org/10.18653/v1/W18-2404","url":null,"abstract":"The problem of sequence labelling in language understanding would benefit from approaches inspired by semantic priming phenomena. We propose that an attention-based RNN architecture can be used to simulate semantic priming for sequence labelling. Specifically, we employ pre-trained word embeddings to characterize the semantic relationship between utterances and labels. We validate the approach using varying sizes of the ATIS and MEDIA datasets, and show up to 1.4-1.9% improvement in F1 score. The developed framework can enable more explainable and generalizable spoken language understanding systems.","PeriodicalId":189654,"journal":{"name":"NEWS@ACL","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121187725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
NEWS@ACL Pub Date: 2018-07-01 DOI: 10.18653/v1/W18-2413
Roman Grundkiewicz, Kenneth Heafield
{"title":"Neural Machine Translation Techniques for Named Entity Transliteration","authors":"Roman Grundkiewicz, Kenneth Heafield","doi":"10.18653/v1/W18-2413","DOIUrl":"https://doi.org/10.18653/v1/W18-2413","url":null,"abstract":"Transliterating named entities from one language into another can be approached as neural machine translation (NMT) problem, for which we use deep attentional RNN encoder-decoder models. To build a strong transliteration system, we apply well-established techniques from NMT, such as dropout regularization, model ensembling, rescoring with right-to-left models, and back-translation. Our submission to the NEWS 2018 Shared Task on Named Entity Transliteration ranked first in several tracks.","PeriodicalId":189654,"journal":{"name":"NEWS@ACL","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115462510","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
NEWS@ACL Pub Date: 2018-07-01 DOI: 10.18653/v1/W18-2410
Snigdha Singhania, Minh Nguyen, H. Ngo, Nancy Chen
{"title":"Statistical Machine Transliteration Baselines for NEWS 2018","authors":"Snigdha Singhania, Minh Nguyen, H. Ngo, Nancy Chen","doi":"10.18653/v1/W18-2410","DOIUrl":"https://doi.org/10.18653/v1/W18-2410","url":null,"abstract":"This paper reports the results of our transliteration experiments conducted on NEWS 2018 Shared Task dataset. We focus on creating the baseline systems trained using two open-source, statistical transliteration tools, namely Sequitur and Moses. We discuss the pre-processing steps performed on this dataset for both the systems. We also provide a re-ranking system which uses top hypotheses from Sequitur and Moses to create a consolidated list of transliterations. The results obtained from each of these models can be used to present a good starting point for the participating teams.","PeriodicalId":189654,"journal":{"name":"NEWS@ACL","volume":"18 05","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114106842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}