{"title":"Curing the SICK and Other NLI Maladies","authors":"A. Kalouli, Hai Hu, Alexander F. Webb, Larry Moss, Valeria C V de Paiva","doi":"10.1162/coli_a_00465","DOIUrl":"https://doi.org/10.1162/coli_a_00465","url":null,"abstract":"Against the backdrop of the ever-improving Natural Language Inference (NLI) models, recent efforts have focused on the suitability of the current NLI datasets and on the feasibility of the NLI task as it is currently approached. Many of the recent studies have exposed the inherent human disagreements of the inference task and have proposed a shift from categorical labels to human subjective probability assessments, capturing human uncertainty. In this work, we show how neither the current task formulation nor the proposed uncertainty gradient are entirely suitable for solving the NLI challenges. Instead, we propose an ordered sense space annotation, which distinguishes between logical and common-sense inference. One end of the space captures non-sensical inferences, while the other end represents strictly logical scenarios. In the middle of the space, we find a continuum of common-sense, namely, the subjective and graded opinion of a “person on the street.” To arrive at the proposed annotation scheme, we perform a careful investigation of the SICK corpus and we create a taxonomy of annotation issues and guidelines. We re-annotate the corpus with the proposed annotation scheme, utilizing four symbolic inference systems, and then perform a thorough evaluation of the scheme by fine-tuning and testing commonly used pre-trained language models on the re-annotated SICK within various settings. We also pioneer a crowd annotation of a small portion of the MultiNLI corpus, showcasing that it is possible to adapt our scheme for annotation by non-experts on another NLI corpus. Our work shows the efficiency and benefits of the proposed mechanism and opens the way for a careful NLI task refinement.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"49 1","pages":"199-243"},"PeriodicalIF":9.3,"publicationDate":"2022-10-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48889363","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Martha Palmer and Barbara Di Eugenio Interview Martha Evens","authors":"Martha Evens","doi":"10.1162/coli_a_00453","DOIUrl":"https://doi.org/10.1162/coli_a_00453","url":null,"abstract":"strategies and student behaviors, including differences between face-to-face and computer-mediated tutoring sessions; the usage of hinting and of analogies on the part of the tutor; taking initiative on the part of the students; and several domain-based teaching techniques, for example, at which level of knowledge to teach. All of these strategies were implemented, and several were evaluated in careful experiments. CIRCSIM-Tutor was shown to engender significant learning gains, and was used in actual classes, which is even more striking since the NLP technologies available at the time were severely limited. For further details, please see Di Eugenio et al. 3 ]","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"48 1","pages":"765-773"},"PeriodicalIF":9.3,"publicationDate":"2022-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48552327","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Explainable Natural Language Processing","authors":"G. Chrysostomou","doi":"10.1162/coli_r_00460","DOIUrl":"https://doi.org/10.1162/coli_r_00460","url":null,"abstract":"Explainable Natural Language Processing (NLP) is an emerging field, which has received significant attention from the NLP community in the last few years. At its core is the need to explain the predictions of machine learning models, now more frequently deployed and used in sensitive areas such as healthcare and law. The rapid developments in the area of explainable NLP have led to somewhat disconnected groups of studies working on these areas. This disconnect results in researchers adopting various definitions for similar problems, while also in certain cases enabling the re-creation of previous research, highlighting the need for a unified framework for explainable NLP. Written by Anders Søgaard, this book provides the author’s convincing view of how we should first define explanations, and, secondly, how we should categorize explanations and the approaches that generate them, creating first and foremost a taxonomy and a unified framework for explainable NLP. As per the author, this will make it easier to relate studies and explanation methodologies in this field, with the aim of accelerating research. It is a brilliant book for both researchers starting to explore explainable NLP, but also for researchers with experience in this area, as it provides a holistic up-to-date view of the explainable NLP at the local and global level. The author conveniently and logically presents each chapter as a “problem” of explainable NLP, as such providing also a taxonomy of explainable NLP problem areas and current approaches to tackle them. Under each chapter, explanation methods are described in detail, beginning initially with “foundational” approaches (e.g., vanilla gradients) and building toward more complex ones (e.g., integrated gradients). To complement the theory and make this into a complete guide to explainable NLP, the author also describes evaluation approaches and provides a list of datasets and code repositories. As such, although the book requires some basic knowledge of NLP and Machine Learning to get started, it is nevertheless accessible to a large audience. This book is organized into thirteen chapters. In the first chapter the author introduces the problems associated with previously proposed taxonomies for explainable NLP. Chapter 2 follows by introducing popular machine learning architectures used in NLP, while also introducing the explanation taxonomy proposed in the book. Chapters","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"48 1","pages":"1137-1139"},"PeriodicalIF":9.3,"publicationDate":"2022-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47611632","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How Much Does Lookahead Matter for Disambiguation? Partial Arabic Diacritization Case Study","authors":"Saeed Esmail, Kfir Bar, N. Dershowitz","doi":"10.1162/coli_a_00456","DOIUrl":"https://doi.org/10.1162/coli_a_00456","url":null,"abstract":"Abstract We suggest a model for partial diacritization of deep orthographies. We focus on Arabic, where the optional indication of selected vowels by means of diacritics can resolve ambiguity and improve readability. Our partial diacritizer restores short vowels only when they contribute to the ease of understandability during reading a given running text. The idea is to identify those uncertainties of absent vowels that require the reader to look ahead to disambiguate. To achieve this, two independent neural networks are used for predicting diacritics, one that takes the entire sentence as input and another that considers only the text that has been read thus far. Partial diacritization is then determined by retaining precisely those vowels on which the two networks disagree, preferring the reading based on consideration of the whole sentence over the more naïve reading-order diacritization. For evaluation, we prepared a new dataset of Arabic texts with both full and partial vowelization. In addition to facilitating readability, we find that our partial diacritizer improves translation quality compared either to their total absence or to random selection. Lastly, we study the benefit of knowing the text that follows the word in focus toward the restoration of short vowels during reading, and we measure the degree to which lookahead contributes to resolving ambiguities encountered while reading. L’Herbelot had asserted, that the most ancient Korans, written in the Cufic character, had no vowel points; and that these were first invented by Jahia–ben Jamer, who died in the 127th year of the Hegira. “Toderini’s History of Turkish Literature,” Analytical Review (1789)","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"48 1","pages":"1103-1123"},"PeriodicalIF":9.3,"publicationDate":"2022-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45679521","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Analysis of Synonymy and Antonymy in Discourse Relations: An Interpretable Modeling Approach","authors":"Assela Reig-Alamillo, David Torres-Moreno, Eliseo Morales-González, Mauricio Toledo-Acosta, Antoine Taroni, Jorge Hermosillo Valadez","doi":"10.1162/coli_a_00477","DOIUrl":"https://doi.org/10.1162/coli_a_00477","url":null,"abstract":"The idea that discourse relations are interpreted both by explicit content and by shared knowledge between producer and interpreter is pervasive in discourse and linguistic studies. How much weight should be ascribed in this process to the lexical semantics of the arguments is, however, uncertain. We propose a computational approach to analyze contrast and concession relations in the PDTB corpus. Our work sheds light on the question of how much lexical relations contribute to the signaling of such explicit and implicit relations, as well as on the contribution of different parts of speech to these semantic relations. This study contributes to bridging the gap between corpus and computational linguistics by proposing transparent and explainable computational models of discourse relations based on the synonymy and antonymy of their arguments.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"49 1","pages":"429-464"},"PeriodicalIF":9.3,"publicationDate":"2022-08-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43896109","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Information Theory–based Compositional Distributional Semantics","authors":"Enrique Amigó, Alejandro Ariza-Casabona, V. Fresno, M. A. Martí","doi":"10.1162/_","DOIUrl":"https://doi.org/10.1162/_","url":null,"abstract":"Abstract In the context of text representation, Compositional Distributional Semantics models aim to fuse the Distributional Hypothesis and the Principle of Compositionality. Text embedding is based on co-ocurrence distributions and the representations are in turn combined by compositional functions taking into account the text structure. However, the theoretical basis of compositional functions is still an open issue. In this article we define and study the notion of Information Theory–based Compositional Distributional Semantics (ICDS): (i) We first establish formal properties for embedding, composition, and similarity functions based on Shannon’s Information Theory; (ii) we analyze the existing approaches under this prism, checking whether or not they comply with the established desirable properties; (iii) we propose two parameterizable composition and similarity functions that generalize traditional approaches while fulfilling the formal properties; and finally (iv) we perform an empirical study on several textual similarity datasets that include sentences with a high and low lexical overlap, and on the similarity between words and their description. Our theoretical analysis and empirical results show that fulfilling formal properties affects positively the accuracy of text representation models in terms of correspondence (isometry) between the embedding and meaning spaces.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"48 1","pages":"907-948"},"PeriodicalIF":9.3,"publicationDate":"2022-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47598974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Effective Approaches to Neural Query Language Identification","authors":"Xingzhang Ren, Baosong Yang, Dayiheng Liu, Haibo Zhang, Xiaoyu Lv, Liang Yao, Jun Xie","doi":"10.1162/coli_a_00451","DOIUrl":"https://doi.org/10.1162/coli_a_00451","url":null,"abstract":"Abstract Query language identification (Q-LID) plays a crucial role in a cross-lingual search engine. There exist two main challenges in Q-LID: (1) insufficient contextual information in queries for disambiguation; and (2) the lack of query-style training examples for low-resource languages. In this article, we propose a neural Q-LID model by alleviating the above problems from both model architecture and data augmentation perspectives. Concretely, we build our model upon the advanced Transformer model. In order to enhance the discrimination of queries, a variety of external features (e.g., character, word, as well as script) are fed into the model and fused by a multi-scale attention mechanism. Moreover, to remedy the low resource challenge in this task, a novel machine translation–based strategy is proposed to automatically generate synthetic query-style data for low-resource languages. We contribute the first Q-LID test set called QID-21, which consists of search queries in 21 languages. Experimental results reveal that our model yields better classification accuracy than strong baselines and existing LID systems on both query and traditional LID tasks.1","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"48 1","pages":"887-906"},"PeriodicalIF":9.3,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47340984","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing Lifelong Language Learning by Improving Pseudo-Sample Generation","authors":"Kasidis Kanwatchara, Thanapapas Horsuwan, Piyawat Lertvittayakumjorn, B. Kijsirikul, P. Vateekul","doi":"10.1162/coli_a_00449","DOIUrl":"https://doi.org/10.1162/coli_a_00449","url":null,"abstract":"Abstract To achieve lifelong language learning, pseudo-rehearsal methods leverage samples generated from a language model to refresh the knowledge of previously learned tasks. Without proper controls, however, these methods could fail to retain the knowledge of complex tasks with longer texts since most of the generated samples are low in quality. To overcome the problem, we propose three specific contributions. First, we utilize double language models, each of which specializes in a specific part of the input, to produce high-quality pseudo samples. Second, we reduce the number of parameters used by applying adapter modules to enhance training efficiency. Third, we further improve the overall quality of pseudo samples using temporal ensembling and sample regeneration. The results show that our framework achieves significant improvement over baselines on multiple task sequences. Also, our pseudo sample analysis reveals helpful insights for designing even better pseudo-rehearsal methods in the future.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"48 1","pages":"819-848"},"PeriodicalIF":9.3,"publicationDate":"2022-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46164760","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Dimensional Modeling of Emotions in Text with Appraisal Theories: Corpus Creation, Annotation Reliability, and Prediction","authors":"Enrica Troiano, Laura Oberländer, Roman Klinger","doi":"10.1162/coli_a_00461","DOIUrl":"https://doi.org/10.1162/coli_a_00461","url":null,"abstract":"The most prominent tasks in emotion analysis are to assign emotions to texts and to understand how emotions manifest in language. An important observation for natural language processing is that emotions can be communicated implicitly by referring to events alone, appealing to an empathetic, intersubjective understanding of events, even without explicitly mentioning an emotion name. In psychology, the class of emotion theories known as appraisal theories aims at explaining the link between events and emotions. Appraisals can be formalized as variables that measure a cognitive evaluation by people living through an event that they consider relevant. They include the assessment if an event is novel, if the person considers themselves to be responsible, if it is in line with their own goals, and so forth. Such appraisals explain which emotions are developed based on an event, for example, that a novel situation can induce surprise or one with uncertain consequences could evoke fear. We analyze the suitability of appraisal theories for emotion analysis in text with the goal of understanding if appraisal concepts can reliably be reconstructed by annotators, if they can be predicted by text classifiers, and if appraisal concepts help to identify emotion categories. To achieve that, we compile a corpus by asking people to textually describe events that triggered particular emotions and to disclose their appraisals. Then, we ask readers to reconstruct emotions and appraisals from the text. This set-up allows us to measure if emotions and appraisals can be recovered purely from text and provides a human baseline to judge a model’s performance measures. Our comparison of text classification methods to human annotators shows that both can reliably detect emotions and appraisals with similar performance. Therefore, appraisals constitute an alternative computational emotion analysis paradigm and further improve the categorization of emotions in text with joint models.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"49 1","pages":"1-72"},"PeriodicalIF":9.3,"publicationDate":"2022-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43777842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}