{"title":"Gradual Modifications and Abrupt Replacements: Two Stochastic Lexical Ingredients of Language Evolution","authors":"M. Pasquini, M. Serva, D. Vergni","doi":"10.1162/coli_a_00471","DOIUrl":"https://doi.org/10.1162/coli_a_00471","url":null,"abstract":"The evolution of the vocabulary of a language is characterized by two different random processes: abrupt lexical replacements, when a completely new word emerges to represent a given concept (the process that formed the basis of Swadesh's foundation of glottochronology in the 1950s), and gradual lexical modifications that progressively alter words over the centuries, considered here in detail for the first time. The main discriminant between these two processes is their impact on cognacy within a family of languages or dialects, since the former modifies the subsets of cognate terms and the latter does not. Automated cognate detection, performed here following a new approach inspired by graph theory, is a key preliminary step that allows us to later measure the effects of the slow modification process. We test our dual approach on the family of Malagasy dialects using a cladistic analysis, which provides strong evidence that lexical replacements and gradual lexical modifications are two random processes that separately drive the evolution of languages.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"49 1","pages":"301-323"},"PeriodicalIF":9.3,"publicationDate":"2023-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48927342","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Conversational AI: Dialogue Systems, Conversational Agents, and Chatbots by Michael McTear","authors":"Olga Seminck","doi":"10.1162/coli_r_00470","DOIUrl":"https://doi.org/10.1162/coli_r_00470","url":null,"abstract":"This book has appeared in the series Synthesis Lectures on Human Language Technologies: monographs of 50 to 150 pages on specific subjects in computational linguistics. The intended audience of the book is researchers and graduate students in NLP, AI, and related fields. I define myself as a computational linguist; my review is from the perspective of a “random” computational linguistics researcher wanting to learn more about this topic or looking for a good guide to teach a course on dialogue systems. I found the book interesting and very easy to read, and I therefore believe that McTear fully achieved his purpose of writing “a readable introduction to the various concepts, issues and technologies of Conversational AI.” He succeeds remarkably well in staying at the right level of technical detail, never losing sight of the purpose of giving an overview, so the reader does not get lost in numerous details about specific algorithms. Additionally, for people who are experts in Conversational AI, the book could still be very useful because its bibliography is exceptionally complete: a very large number of early works and recent studies are cited and discussed throughout the book. The book is well structured into six chapters. After an introduction, there are two chapters about specific types of dialogue systems: rule-based systems (Chapter 2) and statistical systems (Chapter 3). This is followed by a chapter about evaluation methods (Chapter 4), after which the more recent neural end-to-end systems are reviewed (Chapter 5). The book ends with a chapter on various challenges and future directions for research on Conversational AI (Chapter 6). I found it meaningful to distinguish the three types of dialogue systems: rule-based systems, statistical but modular systems, and end-to-end neural systems. It might at first seem strange that the chapter on system evaluation methods is placed between the chapters about modular statistical dialogue systems and neural end-to-end systems, but as a reader, I believe that the discussion of system evaluation comes at the right place in the book, because it helps to better understand the difference between modular and sequence-to-sequence systems. In this review, I will discuss the chapters one by one in the same order as they appear in the book.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"49 1","pages":"257-259"},"PeriodicalIF":9.3,"publicationDate":"2023-01-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44657355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"From Word Types to Tokens and Back: A Survey of Approaches to Word Meaning Representation and Interpretation","authors":"Marianna Apidianaki","doi":"10.1162/coli_a_00474","DOIUrl":"https://doi.org/10.1162/coli_a_00474","url":null,"abstract":"Vector-based word representation paradigms situate lexical meaning at different levels of abstraction. Distributional and static embedding models generate a single vector per word type, which is an aggregate across the instances of the word in a corpus. Contextual language models, on the contrary, directly capture the meaning of individual word instances. The goal of this survey is to provide an overview of word meaning representation methods, and of the strategies that have been proposed for improving the quality of the generated vectors. These often involve injecting external knowledge about lexical semantic relationships, or refining the vectors to describe different senses. The survey also covers recent approaches for obtaining word type-level representations from token-level ones, and for combining static and contextualized representations. Special focus is given to probing and interpretation studies aimed at discovering the lexical semantic knowledge that is encoded in contextualized representations. The challenges posed by this exploration have motivated interest in deriving static embeddings from contextualized ones, and in methods aimed at improving the similarity estimates that can be drawn from the space of contextual language models.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"49 1","pages":"465-523"},"PeriodicalIF":9.3,"publicationDate":"2022-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49266755","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Metrological Perspective on Reproducibility in NLP*","authors":"Anya Belz","doi":"10.1162/coli_a_00448","DOIUrl":"https://doi.org/10.1162/coli_a_00448","url":null,"abstract":"Abstract Reproducibility has become an increasingly debated topic in NLP and ML over recent years, but so far, no commonly accepted definitions of even basic terms or concepts have emerged. Not only do the different definitions proposed within NLP/ML disagree with each other, they are also not aligned with standard scientific definitions. This article examines the standard definitions of repeatability and reproducibility provided by the meta-science of metrology, and explores what they imply in terms of how to assess reproducibility, and what adopting them would mean for reproducibility assessment in NLP/ML. It turns out that the standard definitions lead directly to a method for assessing reproducibility in quantified terms that renders results from reproduction studies comparable across multiple reproductions of the same original study, as well as reproductions of different original studies. The article considers where this method sits in relation to other aspects of NLP work one might wish to assess in the context of reproducibility.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"48 1","pages":"1125-1135"},"PeriodicalIF":9.3,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43287117","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Erratum: Annotation Curricula to Implicitly Train Non-Expert Annotators","authors":"Ji-Ung Lee, Jan-Christoph Klie, Iryna Gurevych","doi":"10.1162/coli_x_00469","DOIUrl":"https://doi.org/10.1162/coli_x_00469","url":null,"abstract":"Abstract The authors of this work (“Annotation Curricula to Implicitly Train Non-Expert Annotators” by Ji-Ung Lee, Jan-Christoph Klie, and Iryna Gurevych in Computational Linguistics 48:2 https://doi.org/10.1162/coli_a_00436) discovered an incorrect inequality symbol in section 5.3 (page 360). The paper stated that the differences in the annotation times for the control instances result in a p-value of 0.200 which is smaller than 0.05 (p = 0.200 < 0.05). As 0.200 is of course larger than 0.05, the correct inequality symbol is p = 0.200 > 0.05, which is in line with the conclusion that follows in the text. The paper has been updated accordingly.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"48 1","pages":"1141-1141"},"PeriodicalIF":9.3,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41520952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Information Theory–based Compositional Distributional Semantics","authors":"Enrique Amigó, Alejandro Ariza-Casabona, V. Fresno, M. A. Martí","doi":"10.1162/coli_a_00454","DOIUrl":"https://doi.org/10.1162/coli_a_00454","url":null,"abstract":"Abstract In the context of text representation, Compositional Distributional Semantics models aim to fuse the Distributional Hypothesis and the Principle of Compositionality. Text embedding is based on co-occurrence distributions and the representations are in turn combined by compositional functions taking into account the text structure. However, the theoretical basis of compositional functions is still an open issue. In this article we define and study the notion of Information Theory–based Compositional Distributional Semantics (ICDS): (i) We first establish formal properties for embedding, composition, and similarity functions based on Shannon’s Information Theory; (ii) we analyze the existing approaches under this prism, checking whether or not they comply with the established desirable properties; (iii) we propose two parameterizable composition and similarity functions that generalize traditional approaches while fulfilling the formal properties; and finally (iv) we perform an empirical study on several textual similarity datasets that include sentences with a high and low lexical overlap, and on the similarity between words and their description. Our theoretical analysis and empirical results show that fulfilling formal properties positively affects the accuracy of text representation models in terms of correspondence (isometry) between the embedding and meaning spaces.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":" ","pages":"907-948"},"PeriodicalIF":9.3,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49351690","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Nucleus Composition in Transition-based Dependency Parsing","authors":"Joakim Nivre, A. Basirat, Luise Durlich, A. Moss","doi":"10.1162/coli_a_00450","DOIUrl":"https://doi.org/10.1162/coli_a_00450","url":null,"abstract":"Abstract Dependency-based approaches to syntactic analysis assume that syntactic structure can be analyzed in terms of binary asymmetric dependency relations holding between elementary syntactic units. Computational models for dependency parsing almost universally assume that an elementary syntactic unit is a word, while the influential theory of Lucien Tesnière instead posits a more abstract notion of nucleus, which may be realized as one or more words. In this article, we investigate the effect of enriching computational parsing models with a concept of nucleus inspired by Tesnière. We begin by reviewing how the concept of nucleus can be defined in the framework of Universal Dependencies, which has become the de facto standard for training and evaluating supervised dependency parsers, and explaining how composition functions can be used to make neural transition-based dependency parsers aware of the nuclei thus defined. We then perform an extensive experimental study, using data from 20 languages to assess the impact of nucleus composition across languages with different typological characteristics, and utilizing a variety of analytical tools including ablation, linear mixed-effects models, diagnostic classifiers, and dimensionality reduction. The analysis reveals that nucleus composition gives small but consistent improvements in parsing accuracy for most languages, and that the improvement mainly concerns the analysis of main predicates, nominal dependents, clausal dependents, and coordination structures. Significant factors explaining the rate of improvement across languages include entropy in coordination structures and frequency of certain function words, in particular determiners. Analysis using dimensionality reduction and diagnostic classifiers suggests that nucleus composition increases the similarity of vectors representing nuclei of the same syntactic type.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"48 1","pages":"849-886"},"PeriodicalIF":9.3,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47076502","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pretrained Transformers for Text Ranking: BERT and Beyond","authors":"S. Verberne","doi":"10.1162/coli_r_00468","DOIUrl":"https://doi.org/10.1162/coli_r_00468","url":null,"abstract":"Text ranking takes a central place in Information Retrieval (IR), with Web search as its best-known application. More generally, text ranking models are applicable to any Natural Language Processing (NLP) task in which relevance of information plays a role, from filtering and recommendation applications to question answering and semantic similarity comparisons. Since the rise of BERT in 2019, Transformer models have become the most used and studied architectures in both NLP and IR, and they have been applied to basically any task in our research fields—including text ranking. In a fast-changing research context, it can be challenging to keep lecture materials up to date. Lecturers in NLP are grateful to Dan Jurafsky and James Martin for their yearly updates to the third edition of their textbook, making Speech and Language Processing the most comprehensive, modern textbook for NLP. The IR field is less fortunate, still relying on older textbooks, extended with a collection of recent materials that address neural models. The textbook Pretrained Transformers for Text Ranking: BERT and Beyond by Jimmy Lin, Rodrigo Nogueira, and Andrew Yates is a great effort to collect the recent developments in the use of Transformers for text ranking. The introduction of the book is well-scoped, with clear guidance for the reader about topics that are out of scope (such as user aspects). This is followed by an excellent history section, stating for example:","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"49 1","pages":"253-255"},"PeriodicalIF":9.3,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46576476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Validity, Reliability, and Significance: Empirical Methods for NLP and Data Science","authors":"Richard Futrell","doi":"10.1162/coli_r_00467","DOIUrl":"https://doi.org/10.1162/coli_r_00467","url":null,"abstract":"When we come up with a new model in NLP and machine learning more generally, we usually look at some performance metric (one number), compare it against the same performance metric for a strong baseline model (one number), and if the new model gets a better number, we mark it in bold and declare it the winner. For anyone with a background in statistics or a field where conclusions must be drawn on the basis of noisy data, this procedure is frankly shocking. Suppose model A gets a BLEU score one point higher than model B: Is that difference reliable? If you used a slightly different dataset for training and evaluation, would that one point difference still hold? Would the difference even survive running the same models on the same datasets but with different random seeds? In fields such as psychology and biology, it is standard to answer such questions using standardized statistical procedures to make sure that differences of interest are larger than some quantification of measurement noise. Making a claim based on a bare difference of two numbers is unthinkable. Yet statistical procedures remain rare in the evaluation of NLP models, whose performance metrics are arguably just as noisy. To these objections, NLP practitioners can respond that they have faithfully followed the hallowed train-(dev-)test split paradigm. As long as proper test set discipline has been followed, the theory goes, the evaluation is secure: By testing on held-out data, we can be sure that our models are performing well in a way that is independent of random accidents of the training data, and by testing on that data only once, we guard against making claims based on differences that would not replicate if we ran the models again. But does the train-test split paradigm really guard against all problems of validity and reliability? Into this situation comes the book under review, Validity, Reliability, and Significance: Empirical Methods for NLP and Data Science, by Stefan Riezler and Michael Hagmann. The authors argue that the train-test split paradigm does not in fact insulate NLP from problems relating to the validity and reliability of its models, their features, and their performance metrics. They present numerous case studies to prove their point, and advocate and teach standard statistical methods as the solution, with rich examples","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"49 1","pages":"249-251"},"PeriodicalIF":9.3,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47568619","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Finite-State Text Processing","authors":"Aniello De Santo","doi":"10.1162/coli_r_00466","DOIUrl":"https://doi.org/10.1162/coli_r_00466","url":null,"abstract":"The rise in popularity of neural network methods in computational linguistics has led to a richness of valuable books on the topic. On the other hand, there is arguably a shortage of recent materials on foundational computational linguistics methods like finite-state technologies. This is unfortunate, as finite-state approaches not only still find much use in applications for speech and text processing (the core focus of this book), but also seem to be valuable in interpreting and improving neural methods. Moreover, the study of finite-state machines (and their corresponding formal languages) is still proving insightful in theoretical linguistics analyses. In this sense, this book by Gorman and Sproat is a welcome, refreshing contribution aimed at a variety of readers. The book is organized in eight main chapters, and can be conceptually divided into two parts. The first half of the book serves as an introduction to core concepts in formal language and automata theory (Chapter 1), the basic design principles of the Python library used through the book (Chapter 2), and a variety of finite-state algorithms (Chapters 3 and 4). The rest of the book exemplifies the formal notions of the previous chapters with practical applications of finite-state technologies to linguistic problems like morphophonological analysis and text normalization (Chapters 5, 6, and 7). The last chapter (Chapter 8) presents an interesting discussion of future steps from a variety of perspectives, connecting finite-state methods to current trends in the field. In what follows, I discuss the contents of each chapter in more detail. Chapter 1 provides an accessible introduction to formal languages and automata theory, starting with a concise but helpful historical review of the development of finite-state technologies and their ties to linguistics. The chapter balances intuitive explanations of technical concepts with formal rigor, in particular in the presentation of finite-state automata/transducers and their formal definitions. Weighted finite-state automata play a prominent role in the book and the authors introduce them algebraically through the notion of semiring (Pin 1997). This is a welcome choice that makes the book somewhat unique among introductory/application-oriented materials on these topics. Unsurprisingly, this section is the densest part of the chapter and it could have benefitted from an explanation of the intuition behind the connection between well-formedness of a string, recognition by an automaton, and paths over semirings—especially considering how relevant such concepts become when the book transitions","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"49 1","pages":"245-247"},"PeriodicalIF":9.3,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64496956","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}