Latest Articles in Computational Linguistics

Abstractive Text Summarization: Enhancing Sequence-to-Sequence Models Using Word Sense Disambiguation and Semantic Content Generalization
IF 9.3 · CAS Q2 (Computer Science)
Computational Linguistics · Pub Date: 2021-08-05 · DOI: 10.1162/coli_a_00417
P. Kouris, Georgios Alexandridis, A. Stafylopatis
{"title":"Abstractive Text Summarization: Enhancing Sequence-to-Sequence Models Using Word Sense Disambiguation and Semantic Content Generalization","authors":"P. Kouris, Georgios Alexandridis, A. Stafylopatis","doi":"10.1162/coli_a_00417","DOIUrl":"https://doi.org/10.1162/coli_a_00417","url":null,"abstract":"Abstract Nowadays, most research conducted in the field of abstractive text summarization focuses on neural-based models alone, without considering their combination with knowledge-based approaches that could further enhance their efficiency. In this direction, this work presents a novel framework that combines sequence-to-sequence neural-based text summarization along with structure and semantic-based methodologies. The proposed framework is capable of dealing with the problem of out-of-vocabulary or rare words, improving the performance of the deep learning models. The overall methodology is based on a well-defined theoretical model of knowledge-based content generalization and deep learning predictions for generating abstractive summaries. The framework is composed of three key elements: (i) a pre-processing task, (ii) a machine learning methodology, and (iii) a post-processing task. The pre-processing task is a knowledge-based approach, based on ontological knowledge resources, word sense disambiguation, and named entity recognition, along with content generalization, that transforms ordinary text into a generalized form. A deep learning model of attentive encoder-decoder architecture, which is expanded to enable a coping and coverage mechanism, as well as reinforcement learning and transformer-based architectures, is trained on a generalized version of text-summary pairs, learning to predict summaries in a generalized form. The post-processing task utilizes knowledge resources, word embeddings, word sense disambiguation, and heuristic algorithms based on text similarity methods in order to transform the generalized version of a predicted summary to a final, human-readable form. An extensive experimental procedure on three popular data sets evaluates key aspects of the proposed framework, while the obtained results exhibit promising performance, validating the robustness of the proposed approach.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"47 1","pages":"813-859"},"PeriodicalIF":9.3,"publicationDate":"2021-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47005694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 17
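The generalization idea at the heart of the pre-processing step can be sketched compactly. The snippet below is an illustrative simplification, not the authors' pipeline: it replaces infrequent words with a WordNet hypernym (naive first-sense lookup, no real word sense disambiguation), so that a downstream sequence-to-sequence model faces fewer out-of-vocabulary tokens; the frequency threshold is an assumed parameter.

```python
# Sketch of knowledge-based content generalization (assumed simplification,
# not the paper's exact pipeline): rare words are replaced by a WordNet
# hypernym so the seq2seq model sees fewer out-of-vocabulary tokens.
from collections import Counter
from nltk.corpus import wordnet as wn   # requires: nltk.download("wordnet")

def generalize(tokens, vocab_counts, min_freq=5):
    out = []
    for tok in tokens:
        if vocab_counts[tok] >= min_freq:
            out.append(tok)               # frequent word: keep as-is
            continue
        synsets = wn.synsets(tok)         # naive first-sense lookup, no WSD
        hypernyms = synsets[0].hypernyms() if synsets else []
        if hypernyms:
            out.append(hypernyms[0].lemmas()[0].name())  # generalized token
        else:
            out.append(tok)               # no hypernym available: keep
    return out

counts = Counter(["the", "cat", "dog"] * 10 + ["chihuahua"])
print(generalize(["the", "chihuahua", "barked"], counts))
# "chihuahua" is rare, so it is replaced by a hypernym lemma of its first
# WordNet sense; frequent words pass through unchanged.
```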
Variational Deep Logic Network for Joint Inference of Entities and Relations
IF 9.3 · CAS Q2 (Computer Science)
Computational Linguistics · Pub Date: 2021-08-05 · DOI: 10.1162/coli_a_00415
Wenya Wang, Sinno Jialin Pan
{"title":"Variational Deep Logic Network for Joint Inference of Entities and Relations","authors":"Wenya Wang, Sinno Jialin Pan","doi":"10.1162/coli_a_00415","DOIUrl":"https://doi.org/10.1162/coli_a_00415","url":null,"abstract":"Abstract Currently, deep learning models have been widely adopted and achieved promising results on various application domains. Despite their intriguing performance, most deep learning models function as black boxes, lacking explicit reasoning capabilities and explanations, which are usually essential for complex problems. Take joint inference in information extraction as an example. This task requires the identification of multiple structured knowledge from texts, which is inter-correlated, including entities, events, and the relationships between them. Various deep neural networks have been proposed to jointly perform entity extraction and relation prediction, which only propagate information implicitly via representation learning. However, they fail to encode the intensive correlations between entity types and relations to enforce their coexistence. On the other hand, some approaches adopt rules to explicitly constrain certain relational facts, although the separation of rules with representation learning usually restrains the approaches with error propagation. Moreover, the predefined rules are inflexible and might result in negative effects when data is noisy. To address these limitations, we propose a variational deep logic network that incorporates both representation learning and relational reasoning via the variational EM algorithm. The model consists of a deep neural network to learn high-level features with implicit interactions via the self-attention mechanism and a relational logic network to explicitly exploit target interactions. These two components are trained interactively to bring the best of both worlds. We conduct extensive experiments ranging from fine-grained sentiment terms extraction, end-to-end relation prediction, to end-to-end event extraction to demonstrate the effectiveness of our proposed method.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"47 1","pages":"775-812"},"PeriodicalIF":9.3,"publicationDate":"2021-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49240733","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 5
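One way to picture the coexistence constraints the model encodes: a soft logic rule can penalize a confident relation prediction whose presupposed entity types are not predicted with comparable confidence. A toy sketch follows (the rule, names, and penalty form are illustrative assumptions; the paper instead couples the two components through variational EM):

```python
# Toy sketch of a soft logic constraint tying a relation to its argument
# entity types (assumed simplification; the paper's coupling is learned
# jointly via the variational EM algorithm).
def coexistence_penalty(p_rel, p_subj_type, p_obj_type):
    """Soft version of an illustrative rule: works_for(x, y) -> person(x) & org(y).
    The relation probability should not exceed either argument-type
    probability; any excess contributes to the loss."""
    return max(0.0, p_rel - p_subj_type) + max(0.0, p_rel - p_obj_type)

# The network is confident about the relation but not about the object's
# type, so the penalty pushes the predictions toward logical coexistence.
print(coexistence_penalty(p_rel=0.9, p_subj_type=0.95, p_obj_type=0.4))  # 0.5
```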
Linear-Time Calculation of the Expected Sum of Edge Lengths in Random Projective Linearizations of Trees
IF 9.3 · CAS Q2 (Computer Science)
Computational Linguistics · Pub Date: 2021-07-07 · DOI: 10.1162/coli_a_00442
Lluís Alemany-Puig, R. Ferrer-i-Cancho
{"title":"Linear-Time Calculation of the Expected Sum of Edge Lengths in Random Projective Linearizations of Trees","authors":"Lluís Alemany-Puig, R. Ferrer-i-Cancho","doi":"10.1162/coli_a_00442","DOIUrl":"https://doi.org/10.1162/coli_a_00442","url":null,"abstract":"Abstract The syntactic structure of a sentence is often represented using syntactic dependency trees. The sum of the distances between syntactically related words has been in the limelight for the past decades. Research on dependency distances led to the formulation of the principle of dependency distance minimization whereby words in sentences are ordered so as to minimize that sum. Numerous random baselines have been defined to carry out related quantitative studies on lan- guages. The simplest random baseline is the expected value of the sum in unconstrained random permutations of the words in the sentence, namely, when all the shufflings of the words of a sentence are allowed and equally likely. Here we focus on a popular baseline: random projective per- mutations of the words of the sentence, that is, permutations where the syntactic dependency structure is projective, a formal constraint that sentences satisfy often in languages. Thus far, the expectation of the sum of dependency distances in random projective shufflings of a sentence has been estimated approximately with a Monte Carlo procedure whose cost is of the order of Rn, where n is the number of words of the sentence and R is the number of samples; it is well known that the larger R is, the lower the error of the estimation but the larger the time cost. Here we pre- sent formulae to compute that expectation without error in time of the order of n. Furthermore, we show that star trees maximize it, and provide an algorithm to retrieve the trees that minimize it.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"48 1","pages":"491-516"},"PeriodicalIF":9.3,"publicationDate":"2021-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47649997","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 5
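For contrast, the Monte Carlo baseline that the closed-form result replaces is easy to sketch in the simplest, unconstrained setting (uniformly random shufflings, no projectivity constraint; sampling projective linearizations is the harder case the paper solves exactly):

```python
# O(R * n) Monte Carlo estimate of the expected sum of dependency distances
# under unconstrained random shufflings -- the style of baseline the paper's
# exact O(n) formulae replace (shown here without the projectivity constraint).
import random

def expected_sum_mc(n, edges, samples=10_000):
    total = 0.0
    for _ in range(samples):
        perm = list(range(n))
        random.shuffle(perm)          # perm[v] = position of word v
        total += sum(abs(perm[u] - perm[v]) for u, v in edges)
    return total / samples

# Star tree on 5 words: word 0 governs words 1..4.
edges = [(0, i) for i in range(1, 5)]
print(expected_sum_mc(5, edges))      # ~8.0, matching (n - 1)(n + 1) / 3
```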
Decoding Word Embeddings with Brain-Based Semantic Features
IF 9.3 · CAS Q2 (Computer Science)
Computational Linguistics · Pub Date: 2021-07-05 · DOI: 10.1162/coli_a_00412
Emmanuele Chersoni, Enrico Santus, Chu-Ren Huang, Alessandro Lenci
{"title":"Decoding Word Embeddings with Brain-Based Semantic Features","authors":"Emmanuele Chersoni, Enrico Santus, Chu-Ren Huang, Alessandro Lenci","doi":"10.1162/coli_a_00412","DOIUrl":"https://doi.org/10.1162/coli_a_00412","url":null,"abstract":"Word embeddings are vectorial semantic representations built with either counting or predicting techniques aimed at capturing shades of meaning from word co-occurrences. Since their introduction, these representations have been criticized for lacking interpretable dimensions. This property of word embeddings limits our understanding of the semantic features they actually encode. Moreover, it contributes to the “black box” nature of the tasks in which they are used, since the reasons for word embedding performance often remain opaque to humans. In this contribution, we explore the semantic properties encoded in word embeddings by mapping them onto interpretable vectors, consisting of explicit and neurobiologically motivated semantic features (Binder et al. 2016). Our exploration takes into account different types of embeddings, including factorized count vectors and predict models (Skip-Gram, GloVe, etc.), as well as the most recent contextualized representations (i.e., ELMo and BERT). In our analysis, we first evaluate the quality of the mapping in a retrieval task, then we shed light on the semantic features that are better encoded in each embedding type. A large number of probing tasks is finally set to assess how the original and the mapped embeddings perform in discriminating semantic categories. For each probing task, we identify the most relevant semantic features and we show that there is a correlation between the embedding performance and how they encode those features. This study sets itself as a step forward in understanding which aspects of meaning are captured by vector spaces, by proposing a new and simple method to carve human-interpretable semantic representations from distributional vectors.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"47 1","pages":"1-36"},"PeriodicalIF":9.3,"publicationDate":"2021-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41536228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 24
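The abstract does not commit to a particular mapping method; assuming a simple ridge regression from embedding dimensions to the 65 Binder et al. (2016) features, a minimal sketch with stand-in data might look like this:

```python
# Minimal sketch of mapping word embeddings onto interpretable semantic
# features (ridge regression is an assumption here; the abstract does not
# specify the mapping method, and the data below is random stand-in data).
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_words, emb_dim, n_features = 500, 300, 65   # 65 = Binder et al. feature set

X = rng.normal(size=(n_words, emb_dim))       # stand-in word embeddings
Y = rng.normal(size=(n_words, n_features))    # stand-in human feature ratings

mapper = Ridge(alpha=1.0).fit(X[:400], Y[:400])   # learn embedding -> features
pred = mapper.predict(X[400:])                    # decode held-out words
print(pred.shape)                                 # (100, 65)
```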
Toward Gender-Inclusive Coreference Resolution: An Analysis of Gender and Bias Throughout the Machine Learning Lifecycle*
IF 9.3 · CAS Q2 (Computer Science)
Computational Linguistics · Pub Date: 2021-07-05 · DOI: 10.1162/coli_a_00413
Yang Trista Cao, Hal Daumé III
{"title":"Toward Gender-Inclusive Coreference Resolution: An Analysis of Gender and Bias Throughout the Machine Learning Lifecycle*","authors":"Yang Trista Cao, Hal Daumé Iii","doi":"10.1162/coli_a_00413","DOIUrl":"https://doi.org/10.1162/coli_a_00413","url":null,"abstract":"Abstract Correctly resolving textual mentions of people fundamentally entails making inferences about those people. Such inferences raise the risk of systematic biases in coreference resolution systems, including biases that can harm binary and non-binary trans and cis stakeholders. To better understand such biases, we foreground nuanced conceptualizations of gender from sociology and sociolinguistics, and investigate where in the machine learning pipeline such biases can enter a coreference resolution system. We inspect many existing data sets for trans-exclusionary biases, and develop two new data sets for interrogating bias in both crowd annotations and in existing coreference resolution systems. Through these studies, conducted on English text, we confirm that without acknowledging and building systems that recognize the complexity of gender, we will build systems that fail for: quality of service, stereotyping, and over- or under-representation, especially for binary and non-binary trans users.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"47 1","pages":"1-47"},"PeriodicalIF":9.3,"publicationDate":"2021-07-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44371270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 27
Universal Dependencies
IF 9.3 · CAS Q2 (Computer Science)
Computational Linguistics · Pub Date: 2021-07-01 · DOI: 10.1162/coli_a_00402
Joakim Nivre, Daniel Zeman, Filip Ginter, Francis M. Tyers
{"title":"Universal Dependencies","authors":"Joakim Nivre, Daniel Zeman, Filip Ginter, Francis M. Tyers","doi":"10.1162/coli_a_00402","DOIUrl":"https://doi.org/10.1162/coli_a_00402","url":null,"abstract":"Abstract Universal dependencies (UD) is a framework for morphosyntactic annotation of human language, which to date has been used to create treebanks for more than 100 languages. In this article, we outline the linguistic theory of the UD framework, which draws on a long tradition of typologically oriented grammatical theories. Grammatical relations between words are centrally used to explain how predicate–argument structures are encoded morphosyntactically in different languages while morphological features and part-of-speech classes give the properties of words. We argue that this theory is a good basis for crosslinguistically consistent annotation of typologically diverse languages in a way that supports computational natural language understanding as well as broader linguistic studies.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"47 1","pages":"255-308"},"PeriodicalIF":9.3,"publicationDate":"2021-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48406460","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 36
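UD treebanks are distributed in the ten-column CoNLL-U format. A minimal reader for the columns the article's theory centers on (word form, universal POS, head, and grammatical relation) follows; it is a sketch that skips the comment lines, multiword-token ranges, and empty nodes found in real treebanks:

```python
# Minimal CoNLL-U reader for the UD columns discussed in the article
# (sketch only: skips comments, multiword-token ranges, and empty nodes).
sample = """\
1\tDogs\tdog\tNOUN\t_\tNumber=Plur\t2\tnsubj\t_\t_
2\tbark\tbark\tVERB\t_\t_\t0\troot\t_\t_
"""

def read_conllu(block):
    rows = []
    for line in block.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        tok_id = cols[0]
        if "-" in tok_id or "." in tok_id:   # multiword range / empty node
            continue
        # Columns: ID FORM LEMMA UPOS XPOS FEATS HEAD DEPREL DEPS MISC
        rows.append({"id": int(tok_id), "form": cols[1], "upos": cols[3],
                     "head": int(cols[6]), "deprel": cols[7]})
    return rows

for tok in read_conllu(sample):
    print(tok["form"], tok["upos"], "->", tok["head"], tok["deprel"])
# Dogs NOUN -> 2 nsubj
# bark VERB -> 0 root
```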
The Taxonomy of Writing Systems: How to Measure How Logographic a System Is
IF 9.3 · CAS Q2 (Computer Science)
Computational Linguistics · Pub Date: 2021-06-30 · DOI: 10.1162/coli_a_00409
R. Sproat, Alexander Gutkin
{"title":"The Taxonomy of Writing Systems: How to Measure How Logographic a System Is","authors":"R. Sproat, Alexander Gutkin","doi":"10.1162/coli_a_00409","DOIUrl":"https://doi.org/10.1162/coli_a_00409","url":null,"abstract":"Taxonomies of writing systems since Gelb (1952) have classified systems based on what the written symbols represent: if they represent words or morphemes, they are logographic; if syllables, syllabic; if segments, alphabetic; and so forth. Sproat (2000) and Rogers (2005) broke with tradition by splitting the logographic and phonographic aspects into two dimensions, with logography being graded rather than a categorical distinction. A system could be syllabic, and highly logographic; or alphabetic, and mostly non-logographic. This accords better with how writing systems actually work, but neither author proposed a method for measuring logography. In this article we propose a novel measure of the degree of logography that uses an attention-based sequence-to-sequence model trained to predict the spelling of a token from its pronunciation in context. In an ideal phonographic system, the model should need to attend to only the current token in order to compute how to spell it, and this would show in the attention matrix activations. In contrast, with a logographic system, where a given pronunciation might correspond to several different spellings, the model would need to attend to a broader context. The ratio of the activation outside the token and the total activation forms the basis of our measure. We compare this with a simple lexical measure, and an entropic measure, as well as several other neural models, and argue that on balance our attention-based measure accords best with intuition about how logographic various systems are. Our work provides the first quantifiable measure of the notion of logography that accords with linguistic intuition and, we argue, provides better insight into what this notion means.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"47 1","pages":"1-52"},"PeriodicalIF":9.3,"publicationDate":"2021-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48450988","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 7
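Given an attention matrix, the proposed ratio is straightforward to compute. The snippet below is a schematic of the measure described in the abstract, with toy numbers rather than a trained model: the attention mass falling outside the current token's input span, divided by the total mass:

```python
# Schematic computation of the attention-based logography measure: the
# share of attention mass falling outside the current token's input span
# (toy attention weights, not the paper's trained seq2seq model).
import numpy as np

def logography_score(attention, token_span):
    """attention: (output_steps, input_positions), rows sum to 1.
    token_span: (start, end) input indices of the token being spelled."""
    start, end = token_span
    total = attention.sum()
    inside = attention[:, start:end].sum()
    return (total - inside) / total   # 0 = purely phonographic spelling

att = np.array([[0.7, 0.2, 0.1],      # 2 output steps over
                [0.6, 0.3, 0.1]])     # 3 input positions
print(logography_score(att, (0, 1)))  # 0.35: 35% of the mass is context
```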
Understanding Dialogue: Language Use and Social Interaction
IF 9.3 · CAS Q2 (Computer Science)
Computational Linguistics · Pub Date: 2021-06-30 · DOI: 10.1162/coli_r_00411
Rachel Bawden
{"title":"Understanding Dialogue: Language Use and Social Interaction","authors":"Rachel Bawden","doi":"10.1162/coli_r_00411","DOIUrl":"https://doi.org/10.1162/coli_r_00411","url":null,"abstract":"Understanding Dialogue: Language Use and Social Interaction represents a departure from classic theories in psycholinguistics and cognitive sciences; instead of taking as a starting point the isolated speech of an individual that can be extended to accommodate dialogue, a primary focus is put on developing a model adapted to dialogue itself, bearing in mind important aspects of dialogue as an activity with a heavily cooperative component. As a researcher of natural language processing with a background in linguistics, I find highly intriguing the possibilities provided by the dialogue model presented. Although the book does not itself touch upon the potential for automated dialogue, I am inevitably writing this review from the point of view of a computational linguist with these aspects in mind. Building on numerous previous works, including many of the authors’ own studies and theories, Understanding Dialogue presents the shared workspace framework, a framework for understanding not just dialogue but cooperative activities in general, of which dialogue is viewed as a subtype. Based on Bratman’s (1992) concept of shared cooperative activity, the framework provides a joint environment with which interlocutors can interact, both by contributing to the space (with actions or utterances for example), and by perceiving and processing their own or the other participants’ productions. The authors do not limit their work to linguistic communication: Many of their examples, particularly at the beginning of the book, are non-linguistic (e.g., hand shaking, dancing a tango, playing singles tennis); others are primarily physical, but will most likely also involve linguistic communication (such as jointly constructing flat-pack furniture); and others are purely linguistic (e.g., suggesting which restaurant to go to for lunch). The notion of alignment is highly important to this framework both from a linguistic and non-linguistic perspective, and is one of the main inspirations of the book, having previously been presented in Toward a Mechanistic Theory of Dialogue by the same authors. As individuals interact via the joint space, alignment concerns the equivalence in their representations at a conceptual level, with respect to their goals and relevant props in the shared environment (dialogue model alignment) and linguistic representations shared in the workspace (linguistic alignment). Roughly speaking, in this second (linguistic) case, this may for instance correspond to whether or not the individuals have the same representation of the utterance in terms of phonetics (were the sounds perceived correctly?) or in terms of lexical semantics (do they understand the same reference by the word uttered?). 
From here can be explained a number of different dialogue","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"47 1","pages":"1-3"},"PeriodicalIF":9.3,"publicationDate":"2021-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46096381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Embeddings in Natural Language Processing: Theory and Advances in Vector Representations of Meaning
IF 9.3 · CAS Q2 (Computer Science)
Computational Linguistics · Pub Date: 2021-06-30 · DOI: 10.1162/coli_r_00410
Marcos Garcia
{"title":"Embeddings in Natural Language Processing: Theory and Advances in Vector Representations of Meaning","authors":"Marcos Garcia","doi":"10.1162/coli_r_00410","DOIUrl":"https://doi.org/10.1162/coli_r_00410","url":null,"abstract":"Word vector representations have a long tradition in several research fields, such as cognitive science or computational linguistics. They have been used to represent the meaning of various units of natural languages, including, among others, words, phrases, and sentences. Before the deep learning tsunami, count-based vector space models had been successfully used in computational linguistics to represent the semantics of natural languages. However, the rise of neural networks in NLP popularized the use of word embeddings, which are now applied as pre-trained vectors in most machine learning architectures. This book, written by Mohammad Taher Pilehvar and Jose Camacho-Collados, provides a comprehensive and easy-to-read review of the theory and advances in vector models for NLP, focusing specially on semantic representations and their applications. It is a great introduction to different types of embeddings and the background and motivations behind them. In this sense, the authors adequately present the most relevant concepts and approaches that have been used to build vector representations. They also keep track of the most recent advances of this vibrant and fast-evolving area of research, discussing cross-lingual representations and current language models based on the Transformer. Therefore, this is a useful book for researchers interested in computational methods for semantic representations and artificial intelligence. Although some basic knowledge of machine learning may be necessary to follow a few topics, the book includes clear illustrations and explanations, which make it accessible to a wide range of readers. Apart from the preface and the conclusions, the book is organized into eight chapters. In the first two, the authors introduce some of the core ideas of NLP and artificial neural networks, respectively, discussing several concepts that will be useful throughout the book. Then, Chapters 3 to 6 present different types of vector representations at the lexical level (word embeddings, graph embeddings, sense embeddings, and contextualized embeddings), followed by a brief chapter (7) about sentence and document embeddings. For each specific topic, the book includes methods and data sets to assess the quality of the embeddings. Finally, Chapter 8 raises ethical issues involved","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":" ","pages":"1-3"},"PeriodicalIF":9.3,"publicationDate":"2021-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49166342","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 37
Sequence-Level Training for Non-Autoregressive Neural Machine Translation
IF 9.3 · CAS Q2 (Computer Science)
Computational Linguistics · Pub Date: 2021-06-15 · DOI: 10.1162/coli_a_00421
Chenze Shao, Yang Feng, Jinchao Zhang, Fandong Meng, Jie Zhou
{"title":"Sequence-Level Training for Non-Autoregressive Neural Machine Translation","authors":"Chenze Shao, Yang Feng, Jinchao Zhang, Fandong Meng, Jie Zhou","doi":"10.1162/coli_a_00421","DOIUrl":"https://doi.org/10.1162/coli_a_00421","url":null,"abstract":"Abstract In recent years, Neural Machine Translation (NMT) has achieved notable results in various translation tasks. However, the word-by-word generation manner determined by the autoregressive mechanism leads to high translation latency of the NMT and restricts its low-latency applications. Non-Autoregressive Neural Machine Translation (NAT) removes the autoregressive mechanism and achieves significant decoding speedup by generating target words independently and simultaneously. Nevertheless, NAT still takes the word-level cross-entropy loss as the training objective, which is not optimal because the output of NAT cannot be properly evaluated due to the multimodality problem. In this article, we propose using sequence-level training objectives to train NAT models, which evaluate the NAT outputs as a whole and correlates well with the real translation quality. First, we propose training NAT models to optimize sequence-level evaluation metrics (e.g., BLEU) based on several novel reinforcement algorithms customized for NAT, which outperform the conventional method by reducing the variance of gradient estimation. Second, we introduce a novel training objective for NAT models, which aims to minimize the Bag-of-N-grams (BoN) difference between the model output and the reference sentence. The BoN training objective is differentiable and can be calculated efficiently without doing any approximations. Finally, we apply a three-stage training strategy to combine these two methods to train the NAT model. We validate our approach on four translation tasks (WMT14 En↔De, WMT16 En↔Ro), which shows that our approach largely outperforms NAT baselines and achieves remarkable performance on all translation tasks. The source code is available at https://github.com/ictnlp/Seq-NAT.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"47 1","pages":"891-925"},"PeriodicalIF":9.3,"publicationDate":"2021-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45515778","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 19
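On discrete sequences, the BoN difference reduces to a distance between n-gram multisets. A toy discrete sketch follows, assuming an L1 distance between bags; the paper's training objective is the differentiable analogue computed from the NAT output distribution:

```python
# Toy discrete version of the Bag-of-N-grams (BoN) difference between a
# candidate and a reference (the paper trains on the differentiable
# analogue computed from the model's output distribution).
from collections import Counter

def bag_of_ngrams(tokens, n=2):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bon_difference(candidate, reference, n=2):
    c, r = bag_of_ngrams(candidate, n), bag_of_ngrams(reference, n)
    # L1 distance between the two n-gram multisets
    return sum(abs(c[g] - r[g]) for g in set(c) | set(r))

ref = "the cat sat on the mat".split()
hyp = "the cat sat the on mat".split()   # typical NAT word-order error
print(bon_difference(hyp, ref))          # 6: penalizes the swapped bigrams
```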