{"title":"Improved N-Best Extraction with an Evaluation on Language Data","authors":"Johanna Björklund, F. Drewes, Anna Jonsson","doi":"10.1162/coli_a_00427","DOIUrl":"https://doi.org/10.1162/coli_a_00427","url":null,"abstract":"We show that a previously proposed algorithm for the N-best trees problem can be made more efficient by changing how it arranges and explores the search space. Given an integer N and a weighted tree automaton (wta) M over the tropical semiring, the algorithm computes N trees of minimal weight with respect to M. Compared with the original algorithm, the modifications increase the laziness of the evaluation strategy, which makes the new algorithm asymptotically more efficient than its predecessor. The algorithm is implemented in the software Betty, and compared to the state-of-the-art algorithm for extracting the N best runs, implemented in the software toolkit Tiburon. The data sets used in the experiments are wtas resulting from real-world natural language processing tasks, as well as artificially created wtas with varying degrees of nondeterminism. We find that Betty outperforms Tiburon on all tested data sets with respect to running time, while Tiburon seems to be the more memory-efficient choice.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"48 1","pages":"119-153"},"PeriodicalIF":9.3,"publicationDate":"2021-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43997136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Revisiting the Boundary between ASR and NLU in the Age of Conversational Dialog Systems","authors":"Manaal Faruqui, Dilek Z. Hakkani-Tür","doi":"10.1162/coli_a_00430","DOIUrl":"https://doi.org/10.1162/coli_a_00430","url":null,"abstract":"As more users across the world are interacting with dialog agents in their daily life, there is a need for better speech understanding that calls for renewed attention to the dynamics between research in automatic speech recognition (ASR) and natural language understanding (NLU). We briefly review these research areas and lay out the current relationship between them. In light of the observations we make in this article, we argue that (1) NLU should be cognizant of the presence of ASR models being used upstream in a dialog system’s pipeline, (2) ASR should be able to learn from errors found in NLU, (3) there is a need for end-to-end data sets that provide semantic annotations on spoken input, (4) there should be stronger collaboration between ASR and NLU research communities.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"48 1","pages":"221-232"},"PeriodicalIF":9.3,"publicationDate":"2021-12-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49216893","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"To Augment or Not to Augment? A Comparative Study on Text Augmentation Techniques for Low-Resource NLP","authors":"Gözde Gül Şahin","doi":"10.1162/coli_a_00425","DOIUrl":"https://doi.org/10.1162/coli_a_00425","url":null,"abstract":"Abstract Data-hungry deep neural networks have established themselves as the de facto standard for many NLP tasks, including the traditional sequence tagging ones. Despite their state-of-the-art performance on high-resource languages, they still fall behind their statistical counterparts in low-resource scenarios. One methodology to counterattack this problem is text augmentation, that is, generating new synthetic training data points from existing data. Although NLP has recently witnessed several new textual augmentation techniques, the field still lacks a systematic performance analysis on a diverse set of languages and sequence tagging tasks. To fill this gap, we investigate three categories of text augmentation methodologies that perform changes on the syntax (e.g., cropping sub-sentences), token (e.g., random word insertion), and character (e.g., character swapping) levels. We systematically compare the methods on part-of-speech tagging, dependency parsing, and semantic role labeling for a diverse set of language families using various models, including the architectures that rely on pretrained multilingual contextualized language models such as mBERT. Augmentation most significantly improves dependency parsing, followed by part-of-speech tagging and semantic role labeling. We find the experimented techniques to be effective on morphologically rich languages in general rather than analytic languages such as Vietnamese. Our results suggest that the augmentation techniques can further improve over strong baselines based on mBERT, especially for dependency parsing. We identify the character-level methods as the most consistent performers, while synonym replacement and syntactic augmenters provide inconsistent improvements. Finally, we discuss that the results most heavily depend on the task, language pair (e.g., syntactic-level techniques mostly benefit higher-level tasks and morphologically richer languages), and model type (e.g., token-level augmentation provides significant improvements for BPE, while character-level ones give generally higher scores for char and mBERT based models).","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"48 1","pages":"5-42"},"PeriodicalIF":9.3,"publicationDate":"2021-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49107971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Natural Language Processing and Computational Linguistics","authors":"Jun'ichi Tsujii","doi":"10.1162/coli_a_00420","DOIUrl":"https://doi.org/10.1162/coli_a_00420","url":null,"abstract":"away other aspects of information, such as the speaker’s empathy, distinction of old/new information, emphasis, and so on. To climb up the hierarchy led to loss of information in lower levels of representation. In Tsujii (1986), instead of mapping at the abstract level, I proposed “transfer based on a bundle of features of all the levels”, in which the transfer would refer to all levels of representation in the source language to produce a corresponding representation in the target language (Figure 4). Because different levels of representation require different geometrical structures (i.e., different tree structures), the realization of this proposal had to wait for development of a clear mathematical formulation of feature-based 6 IS (Interface Structure) is dependent on a specific language. In particular, unlike the interlingual approach, Eurotra did not assume language-independent leximemes in ISs so that the transfer phase between the two ISs (source and target ISs) was indispensable. See footnote 5. 711 D ow naded rom httpdirect.m it.edu/coli/article-p7/1979478/coli_a_00420.pdf by gest on 04 M arch 2022 Computational Linguistics Volume 47, Number 4 Figure 4 Description-based transfer (Tsujii 1986). representation with reentrancy, which allowed multiple levels (i.e., multiple trees) to be represented with their mutual relationships (see the next section). Another idea we adopted to systematize the transfer phase was recursive transfer (Nagao and Tsujii 1986), which was inspired by the idea of compositional semantics in CL. According to the views of linguists at the time, a language is an infinite set of expressions which, in turn, is defined by a finite set of rules. By applying this finite number of rules, one can generate infinitely many grammatical sentences of the language. Compositional semantics claimed that the meaning of a phrase was determined by combining the meanings of its subphrases, using the rules that generated the phrase. Compositional translation applied the same idea to translation. That is, the translation of a phrase was determined by combining the translations of its subphrases. In this way, translations of infinitely many sentences of the source language could be generated. Using the compositional translation approach, the translation of a sentence would be undertaken by recursively tracing a tree structure of a source sentence. The translation of a phrase would then be formulated by combining the translations of its subphrases. That is, translation would be constructed in a bottom up manner, from smaller units of translation to larger units. Furthermore, because the mapping of a phrase from the source to the target would be determined by the lexical head of the phrase, the lexical entry for the head word specified how to map a phrase to the target. In the MU project, we called this lexicondriven, recursive transfer (Nagao and Tsujii 1986) (Figure 5). 712 D ow naded rom httpdirect.m it.edu/coli/article-p7/1979478/coli_","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"47 1","pages":"707-727"},"PeriodicalIF":9.3,"publicationDate":"2021-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44009399","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Natural Language Processing: A Machine Learning Perspective by Yue Zhang and Zhiyang Teng","authors":"Julia Ive","doi":"10.1162/coli_r_00423","DOIUrl":"https://doi.org/10.1162/coli_r_00423","url":null,"abstract":"","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"48 1","pages":"233-235"},"PeriodicalIF":9.3,"publicationDate":"2021-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42794314","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ethics Sheet for Automatic Emotion Recognition and Sentiment Analysis","authors":"Saif M. Mohammad","doi":"10.1162/coli_a_00433","DOIUrl":"https://doi.org/10.1162/coli_a_00433","url":null,"abstract":"Abstract The importance and pervasiveness of emotions in our lives makes affective computing a tremendously important and vibrant line of work. Systems for automatic emotion recognition (AER) and sentiment analysis can be facilitators of enormous progress (e.g., in improving public health and commerce) but also enablers of great harm (e.g., for suppressing dissidents and manipulating voters). Thus, it is imperative that the affective computing community actively engage with the ethical ramifications of their creations. In this article, I have synthesized and organized information from AI Ethics and Emotion Recognition literature to present fifty ethical considerations relevant to AER. Notably, this ethics sheet fleshes out assumptions hidden in how AER is commonly framed, and in the choices often made regarding the data, method, and evaluation. Special attention is paid to the implications of AER on privacy and social groups. Along the way, key recommendations are made for responsible AER. The objective of the ethics sheet is to facilitate and encourage more thoughtfulness on why to automate, how to automate, and how to judge success well before the building of AER systems. Additionally, the ethics sheet acts as a useful introductory document on emotion recognition (complementing survey articles).","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"48 1","pages":"239-278"},"PeriodicalIF":9.3,"publicationDate":"2021-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48330840","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
B. Haddow, Rachel Bawden, Antonio Valerio Miceli Barone, Jindvrich Helcl, Alexandra Birch
{"title":"Survey of Low-Resource Machine Translation","authors":"B. Haddow, Rachel Bawden, Antonio Valerio Miceli Barone, Jindvrich Helcl, Alexandra Birch","doi":"10.1162/coli_a_00446","DOIUrl":"https://doi.org/10.1162/coli_a_00446","url":null,"abstract":"Abstract We present a survey covering the state of the art in low-resource machine translation (MT) research. There are currently around 7,000 languages spoken in the world and almost all language pairs lack significant resources for training machine translation models. There has been increasing interest in research addressing the challenge of producing useful translation models when very little translated training data is available. We present a summary of this topical research field and provide a description of the techniques evaluated by researchers in several recent shared tasks in low-resource MT.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"48 1","pages":"673-732"},"PeriodicalIF":9.3,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43974444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Fernando Alva-Manchego, Carolina Scarton, Lucia Specia
{"title":"The (Un)Suitability of Automatic Evaluation Metrics for Text Simplification","authors":"Fernando Alva-Manchego, Carolina Scarton, Lucia Specia","doi":"10.1162/coli_a_00418","DOIUrl":"https://doi.org/10.1162/coli_a_00418","url":null,"abstract":"Abstract In order to simplify sentences, several rewriting operations can be performed, such as replacing complex words per simpler synonyms, deleting unnecessary information, and splitting long sentences. Despite this multi-operation nature, evaluation of automatic simplification systems relies on metrics that moderately correlate with human judgments on the simplicity achieved by executing specific operations (e.g., simplicity gain based on lexical replacements). In this article, we investigate how well existing metrics can assess sentence-level simplifications where multiple operations may have been applied and which, therefore, require more general simplicity judgments. For that, we first collect a new and more reliable data set for evaluating the correlation of metrics and human judgments of overall simplicity. Second, we conduct the first meta-evaluation of automatic metrics in Text Simplification, using our new data set (and other existing data) to analyze the variation of the correlation between metrics’ scores and human judgments across three dimensions: the perceived simplicity level, the system type, and the set of references used for computation. We show that these three aspects affect the correlations and, in particular, highlight the limitations of commonly used operation-specific metrics. Finally, based on our findings, we propose a set of recommendations for automatic evaluation of multi-operation simplifications, suggesting which metrics to compute and how to interpret their scores.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"47 1","pages":"861-889"},"PeriodicalIF":9.3,"publicationDate":"2021-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45077149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LFG Generation from Acyclic F-Structures is NP-Hard","authors":"Jürgen Wedekind, R. Kaplan","doi":"10.1162/coli_a_00419","DOIUrl":"https://doi.org/10.1162/coli_a_00419","url":null,"abstract":"Abstract The universal generation problem for LFG grammars is the problem of determining whether a given grammar derives any terminal string with a given f-structure. It is known that this problem is decidable for acyclic f-structures. In this brief note, we show that for those f-structures the problem is nonetheless intractable. This holds even for grammars that are off-line parsable.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"47 1","pages":"939-946"},"PeriodicalIF":9.3,"publicationDate":"2021-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45076327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Are Ellipses Important for Machine Translation?","authors":"Payal Khullar","doi":"10.1162/coli_a_00414","DOIUrl":"https://doi.org/10.1162/coli_a_00414","url":null,"abstract":"Abstract This article describes an experiment to evaluate the impact of different types of ellipses discussed in theoretical linguistics on Neural Machine Translation (NMT), using English to Hindi/Telugu as source and target languages. Evaluation with manual methods shows that most of the errors made by Google NMT are located in the clause containing the ellipsis, the frequency of such errors is slightly more in Telugu than Hindi, and the translation adequacy shows improvement when ellipses are reconstructed with their antecedents. These findings not only confirm the importance of ellipses and their resolution for MT, but also hint toward a possible correlation between the translation of discourse devices like ellipses with the morphological incongruity of the source and target. We also observe that not all ellipses are translated poorly and benefit from reconstruction, advocating for a disparate treatment of different ellipses in MT research.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"47 1","pages":"927-937"},"PeriodicalIF":9.3,"publicationDate":"2021-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64495124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}