Transactions of the Association for Computational Linguistics: Latest Publications

Transparency Helps Reveal When Language Models Learn Meaning
IF 10.9 · CAS Zone 1 (Computer Science)
Transactions of the Association for Computational Linguistics · Pub Date: 2022-10-14 · DOI: 10.1162/tacl_a_00565
Zhaofeng Wu, Will Merrill, Hao Peng, Iz Beltagy, Noah A. Smith
{"title":"Transparency Helps Reveal When Language Models Learn Meaning","authors":"Zhaofeng Wu, Will Merrill, Hao Peng, Iz Beltagy, Noah A. Smith","doi":"10.1162/tacl_a_00565","DOIUrl":"https://doi.org/10.1162/tacl_a_00565","url":null,"abstract":"Many current NLP systems are built from language models trained to optimize unsupervised objectives on large amounts of raw text. Under what conditions might such a procedure acquire meaning? Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations (i.e., languages with strong transparency), both autoregressive and masked language models successfully learn to emulate semantic relations between expressions. However, when denotations are changed to be context-dependent with the language otherwise unmodified, this ability degrades. Turning to natural language, our experiments with a specific phenomenon—referential opacity—add to the growing body of evidence that current language models do not represent natural language semantics well. We show this failure relates to the context-dependent nature of natural language form-meaning mappings.","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"11 1","pages":"617-634"},"PeriodicalIF":10.9,"publicationDate":"2022-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41764344","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
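To make "strong transparency" concrete, here is a minimal sketch (my own illustration, not the paper's synthetic languages) contrasting a language whose expressions have fixed, context-independent denotations with a variant whose denotations depend on an environment:

```python
# Toy contrast between a strongly transparent language and a context-dependent
# variant. Names (denote_transparent, denote_contextual, env) are illustrative.

from typing import Union

Expr = Union[int, str, tuple]  # literal | variable | ("+", left, right)

def denote_transparent(e: Expr) -> int:
    """Strongly transparent language: every expression has a fixed,
    context-independent denotation."""
    if isinstance(e, int):
        return e
    op, left, right = e
    assert op == "+"
    return denote_transparent(left) + denote_transparent(right)

def denote_contextual(e: Expr, env: dict) -> int:
    """Context-dependent variant: variables take their value from an
    environment, so the same expression can denote different things."""
    if isinstance(e, int):
        return e
    if isinstance(e, str):  # a variable; its meaning depends on context
        return env[e]
    op, left, right = e
    assert op == "+"
    return denote_contextual(left, env) + denote_contextual(right, env)

expr = ("+", 1, ("+", 2, 3))
print(denote_transparent(expr))                     # always 6
print(denote_contextual(("+", 1, "x"), {"x": 2}))   # 3 here, but depends on env
```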
Explainable Abuse Detection as Intent Classification and Slot Filling
IF 10.9 · CAS Zone 1 (Computer Science)
Transactions of the Association for Computational Linguistics · Pub Date: 2022-10-06 · DOI: 10.1162/tacl_a_00527
Agostina Calabrese, Björn Ross, Mirella Lapata
{"title":"Explainable Abuse Detection as Intent Classification and Slot Filling","authors":"Agostina Calabrese, Björn Ross, Mirella Lapata","doi":"10.1162/tacl_a_00527","DOIUrl":"https://doi.org/10.1162/tacl_a_00527","url":null,"abstract":"Abstract To proactively offer social media users a safe online experience, there is a need for systems that can detect harmful posts and promptly alert platform moderators. In order to guarantee the enforcement of a consistent policy, moderators are provided with detailed guidelines. In contrast, most state-of-the-art models learn what abuse is from labeled examples and as a result base their predictions on spurious cues, such as the presence of group identifiers, which can be unreliable. In this work we introduce the concept of policy-aware abuse detection, abandoning the unrealistic expectation that systems can reliably learn which phenomena constitute abuse from inspecting the data alone. We propose a machine-friendly representation of the policy that moderators wish to enforce, by breaking it down into a collection of intents and slots. We collect and annotate a dataset of 3,535 English posts with such slots, and show how architectures for intent classification and slot filling can be used for abuse detection, while providing a rationale for model decisions.1","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"10 1","pages":"1440-1454"},"PeriodicalIF":10.9,"publicationDate":"2022-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46558658","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
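As a rough sketch of the modeling setup the paper borrows from task-oriented dialogue, here is a joint intent-and-slot head over a shared encoder. This is illustrative only: the intent/slot inventory sizes and the CLS-style pooling are assumptions, not the paper's architecture.

```python
# A joint intent classification + slot filling head over a shared encoder
# (a sketch; not the authors' implementation or schema).

import torch
import torch.nn as nn

class JointIntentSlotHead(nn.Module):
    def __init__(self, hidden: int, n_intents: int, n_slots: int):
        super().__init__()
        self.intent_clf = nn.Linear(hidden, n_intents)  # one label per post
        self.slot_clf = nn.Linear(hidden, n_slots)      # one label per token

    def forward(self, token_states: torch.Tensor):
        # token_states: (batch, seq_len, hidden) from any pretrained encoder
        intent_logits = self.intent_clf(token_states[:, 0])  # CLS-style pooling
        slot_logits = self.slot_clf(token_states)            # per-token tags
        return intent_logits, slot_logits

head = JointIntentSlotHead(hidden=768, n_intents=5, n_slots=9)
states = torch.randn(2, 16, 768)               # stand-in encoder output
intent_logits, slot_logits = head(states)
print(intent_logits.shape, slot_logits.shape)  # (2, 5) (2, 16, 9)
```

The slot tags over tokens are what provide the per-decision rationale the abstract describes.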
Domain-Specific Word Embeddings with Structure Prediction
IF 10.9 · CAS Zone 1 (Computer Science)
Transactions of the Association for Computational Linguistics · Pub Date: 2022-10-06 · DOI: 10.1162/tacl_a_00538
Stephanie Brandl, D. Lassner, A. Baillot, S. Nakajima
{"title":"Domain-Specific Word Embeddings with Structure Prediction","authors":"Stephanie Brandl, D. Lassner, A. Baillot, S. Nakajima","doi":"10.1162/tacl_a_00538","DOIUrl":"https://doi.org/10.1162/tacl_a_00538","url":null,"abstract":"Complementary to finding good general word embeddings, an important question for representation learning is to find dynamic word embeddings, for example, across time or domain. Current methods do not offer a way to use or predict information on structure between sub-corpora, time or domain and dynamic embeddings can only be compared after post-alignment. We propose novel word embedding methods that provide general word representations for the whole corpus, domain- specific representations for each sub-corpus, sub-corpus structure, and embedding alignment simultaneously. We present an empirical evaluation on New York Times articles and two English Wikipedia datasets with articles on science and philosophy. Our method, called Word2Vec with Structure Prediction (W2VPred), provides better performance than baselines in terms of the general analogy tests, domain-specific analogy tests, and multiple specific word embedding evaluations as well as structure prediction performance when no structure is given a priori. As a use case in the field of Digital Humanities we demonstrate how to raise novel research questions for high literature from the German Text Archive.","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"11 1","pages":"320-335"},"PeriodicalIF":10.9,"publicationDate":"2022-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43780471","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
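The following sketch is my own simplification of the idea in the abstract: shared general vectors plus per-domain offsets, so domain embeddings are aligned by construction and sub-corpus structure can be read off the offsets. The names and the similarity-based structure readout are assumptions, not the W2VPred implementation.

```python
# Shared general embeddings + per-domain offsets (a simplification of the
# abstract's setup, not the released W2VPred code).

import numpy as np

rng = np.random.default_rng(0)
vocab, dim, n_domains = 1000, 100, 3

general = rng.normal(size=(vocab, dim))                        # whole corpus
offsets = rng.normal(scale=0.1, size=(n_domains, vocab, dim))  # per sub-corpus

def domain_embedding(word_id: int, domain: int) -> np.ndarray:
    """Domain-specific vector = general vector + domain offset.
    All domains live in one space, so no post-alignment is needed."""
    return general[word_id] + offsets[domain, word_id]

# Read sub-corpus structure off the offsets as a similarity matrix
# (an illustrative choice of structure readout):
flat = offsets.reshape(n_domains, -1)
norm = flat / np.linalg.norm(flat, axis=1, keepdims=True)
structure = norm @ norm.T   # (n_domains, n_domains)
print(structure.round(2))
```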
Improving the Domain Adaptation of Retrieval Augmented Generation (RAG) Models for Open Domain Question Answering
IF 10.9 · CAS Zone 1 (Computer Science)
Transactions of the Association for Computational Linguistics · Pub Date: 2022-10-06 · DOI: 10.1162/tacl_a_00530
Shamane Siriwardhana, Rivindu Weerasekera, Elliott Wen, Tharindu Kaluarachchi, R. Rana, Suranga Nanayakkara
{"title":"Improving the Domain Adaptation of Retrieval Augmented Generation (RAG) Models for Open Domain Question Answering","authors":"Shamane Siriwardhana, Rivindu Weerasekera, Elliott Wen, Tharindu Kaluarachchi, R. Rana, Suranga Nanayakkara","doi":"10.1162/tacl_a_00530","DOIUrl":"https://doi.org/10.1162/tacl_a_00530","url":null,"abstract":"Retrieval Augment Generation (RAG) is a recent advancement in Open-Domain Question Answering (ODQA). RAG has only been trained and explored with a Wikipedia-based external knowledge base and is not optimized for use in other specialized domains such as healthcare and news. In this paper, we evaluate the impact of joint training of the retriever and generator components of RAG for the task of domain adaptation in ODQA. We propose RAG-end2end, an extension to RAG that can adapt to a domain-specific knowledge base by updating all components of the external knowledge base during training. In addition, we introduce an auxiliary training signal to inject more domain-specific knowledge. This auxiliary signal forces RAG-end2end to reconstruct a given sentence by accessing the relevant information from the external knowledge base. Our novel contribution is that, unlike RAG, RAG-end2end does joint training of the retriever and generator for the end QA task and domain adaptation. We evaluate our approach with datasets from three domains: COVID-19, News, and Conversations, and achieve significant performance improvements compared to the original RAG model. Our work has been open-sourced through the HuggingFace Transformers library, attesting to our work’s credibility and technical consistency.","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"55 1","pages":"1-17"},"PeriodicalIF":10.9,"publicationDate":"2022-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64440765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
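The released code extends HuggingFace Transformers. For orientation, here is the library-documented base-RAG inference pipeline that RAG-end2end builds on; the checkpoint name and dummy index come from the Transformers documentation, and the joint retriever-generator training itself is not shown here.

```python
# Base RAG inference with HuggingFace Transformers (documented library usage;
# RAG-end2end extends this with joint retriever/generator fine-tuning and a
# sentence-reconstruction signal, which is not part of this snippet).

from transformers import RagTokenizer, RagRetriever, RagSequenceForGeneration

tokenizer = RagTokenizer.from_pretrained("facebook/rag-sequence-nq")
# use_dummy_dataset avoids downloading the full Wikipedia index for a demo
retriever = RagRetriever.from_pretrained(
    "facebook/rag-sequence-nq", index_name="exact", use_dummy_dataset=True
)
model = RagSequenceForGeneration.from_pretrained(
    "facebook/rag-sequence-nq", retriever=retriever
)

inputs = tokenizer("who wrote the declaration of independence", return_tensors="pt")
generated = model.generate(input_ids=inputs["input_ids"])
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```

Swapping the retriever's index for a domain-specific knowledge base is exactly the adaptation setting the paper targets.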
FRMT: A Benchmark for Few-Shot Region-Aware Machine Translation
IF 10.9 · CAS Zone 1 (Computer Science)
Transactions of the Association for Computational Linguistics · Pub Date: 2022-10-01 · DOI: 10.1162/tacl_a_00568
Parker Riley, Timothy Dozat, Jan A. Botha, Xavier García, Dan Garrette, Jason Riesa, Orhan Firat, Noah Constant
{"title":"FRMT: A Benchmark for Few-Shot Region-Aware Machine Translation","authors":"Parker Riley, Timothy Dozat, Jan A. Botha, Xavier García, Dan Garrette, Jason Riesa, Orhan Firat, Noah Constant","doi":"10.1162/tacl_a_00568","DOIUrl":"https://doi.org/10.1162/tacl_a_00568","url":null,"abstract":"We present FRMT, a new dataset and evaluation benchmark for Few-shot Region-aware Machine Translation, a type of style-targeted translation. The dataset consists of professional translations from English into two regional variants each of Portuguese and Mandarin Chinese. Source documents are selected to enable detailed analysis of phenomena of interest, including lexically distinct terms and distractor terms. We explore automatic evaluation metrics for FRMT and validate their correlation with expert human evaluation across both region-matched and mismatched rating scenarios. Finally, we present a number of baseline models for this task, and offer guidelines for how researchers can train, evaluate, and compare their own models. Our dataset and evaluation code are publicly available: https://bit.ly/frmt-task.","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"11 1","pages":"671-685"},"PeriodicalIF":10.9,"publicationDate":"2022-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47872341","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
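As one way to picture the few-shot region-aware setting, here is a minimal prompt-construction sketch. This is my own illustration: FRMT is a dataset and benchmark, and the prompting setup, function name, and exemplar sentences here are assumptions, not the paper's baselines.

```python
# Building a few-shot prompt with region-matched exemplars so a text-to-text
# model is steered toward one regional variant (illustrative only).

def build_fewshot_prompt(exemplars, source, region):
    """Format (source, target) exemplar pairs for one regional variant,
    followed by the new source sentence to translate."""
    lines = [f"Translate English to Portuguese ({region}):"]
    for src, tgt in exemplars:
        lines.append(f"English: {src}")
        lines.append(f"Portuguese: {tgt}")
    lines.append(f"English: {source}")
    lines.append("Portuguese:")
    return "\n".join(lines)

brazil_exemplars = [
    ("The bus is late.", "O ônibus está atrasado."),
]
prompt = build_fewshot_prompt(
    brazil_exemplars, "Where is the train station?", "Brazil"
)
print(prompt)
```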
Meta-Learning a Cross-lingual Manifold for Semantic Parsing
IF 10.9 · CAS Zone 1 (Computer Science)
Transactions of the Association for Computational Linguistics · Pub Date: 2022-09-26 · DOI: 10.1162/tacl_a_00533
Tom Sherborne, Mirella Lapata
{"title":"Meta-Learning a Cross-lingual Manifold for Semantic Parsing","authors":"Tom Sherborne, Mirella Lapata","doi":"10.1162/tacl_a_00533","DOIUrl":"https://doi.org/10.1162/tacl_a_00533","url":null,"abstract":"Localizing a semantic parser to support new languages requires effective cross-lingual generalization. Recent work has found success with machine-translation or zero-shot methods, although these approaches can struggle to model how native speakers ask questions. We consider how to effectively leverage minimal annotated examples in new languages for few-shot cross-lingual semantic parsing. We introduce a first-order meta-learning algorithm to train a semantic parser with maximal sample efficiency during cross-lingual transfer. Our algorithm uses high-resource languages to train the parser and simultaneously optimizes for cross-lingual generalization to lower-resource languages. Results across six languages on ATIS demonstrate that our combination of generalization steps yields accurate semantic parsers sampling ≤10% of source training data in each new language. Our approach also trains a competitive model on Spider using English with generalization to Chinese similarly sampling ≤10% of training data.1","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"11 1","pages":"49-67"},"PeriodicalIF":10.9,"publicationDate":"2022-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43271061","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
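The paper's algorithm is a first-order meta-learner tailored to cross-lingual transfer; as a generic reference point for that family of methods, here is a Reptile-style first-order meta-update in PyTorch (an illustration of the technique class, not the authors' algorithm):

```python
# Reptile-style first-order meta-update: adapt a copy of the model on one
# task, then move the original weights toward the adapted weights. No
# second-order gradients are needed, which is what "first-order" buys you.

import copy
import torch

def reptile_meta_step(model, task_batches, loss_fn,
                      inner_lr=1e-3, meta_lr=0.1, inner_steps=3):
    adapted = copy.deepcopy(model)
    opt = torch.optim.SGD(adapted.parameters(), lr=inner_lr)
    for _ in range(inner_steps):
        for x, y in task_batches:
            opt.zero_grad()
            loss_fn(adapted(x), y).backward()
            opt.step()
    # First-order outer update: interpolate toward the adapted parameters.
    with torch.no_grad():
        for p, p_adapted in zip(model.parameters(), adapted.parameters()):
            p += meta_lr * (p_adapted - p)

# Toy usage: a linear "parser" meta-trained on one synthetic task.
model = torch.nn.Linear(4, 2)
batches = [(torch.randn(8, 4), torch.randint(0, 2, (8,)))]
reptile_meta_step(model, batches, torch.nn.functional.cross_entropy)
```

In the paper's setting, the inner tasks would be drawn from high-resource languages while the outer objective optimizes generalization to lower-resource ones.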
OPAL: Ontology-Aware Pretrained Language Model for End-to-End Task-Oriented Dialogue
IF 10.9 · CAS Zone 1 (Computer Science)
Transactions of the Association for Computational Linguistics · Pub Date: 2022-09-10 · DOI: 10.1162/tacl_a_00534
Zhi Chen, Yuncong Liu, Lu Chen, Su Zhu, Mengyue Wu, Kai Yu
{"title":"OPAL: Ontology-Aware Pretrained Language Model for End-to-End Task-Oriented Dialogue","authors":"Zhi Chen, Yuncong Liu, Lu Chen, Su Zhu, Mengyue Wu, Kai Yu","doi":"10.1162/tacl_a_00534","DOIUrl":"https://doi.org/10.1162/tacl_a_00534","url":null,"abstract":"This paper presents an ontology-aware pretrained language model (OPAL) for end-to-end task-oriented dialogue (TOD). Unlike chit-chat dialogue models, task-oriented dialogue models fulfill at least two task-specific modules: Dialogue state tracker (DST) and response generator (RG). The dialogue state consists of the domain-slot-value triples, which are regarded as the user’s constraints to search the domain-related databases. The large-scale task-oriented dialogue data with the annotated structured dialogue state usually are inaccessible. It prevents the development of the pretrained language model for the task-oriented dialogue. We propose a simple yet effective pretraining method to alleviate this problem, which consists of two pretraining phases. The first phase is to pretrain on large-scale contextual text data, where the structured information of the text is extracted by the information extracting tool. To bridge the gap between the pretraining method and downstream tasks, we design two pretraining tasks: ontology-like triple recovery and next-text generation, which simulates the DST and RG, respectively. The second phase is to fine-tune the pretrained model on the TOD data. The experimental results show that our proposed method achieves an exciting boost and obtains competitive performance even without any TOD data on CamRest676 and MultiWOZ benchmarks.","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"11 1","pages":"68-84"},"PeriodicalIF":10.9,"publicationDate":"2022-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44597473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
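To make the two pretraining tasks concrete, here is a minimal sketch of how one might construct both kinds of examples from a passage plus extracted triples. This is my own reading of the abstract; the field names, mask token, and triple serialization are assumptions, not the OPAL code.

```python
# Building the two pretraining examples described in the abstract:
# (1) ontology-like triple recovery masks the structured values in context,
# (2) next-text generation asks for the continuation of the passage.

def make_pretraining_examples(context: str, continuation: str, triples):
    masked = context
    for _, _, value in triples:
        masked = masked.replace(value, "[MASK]")  # hide triple values
    triple_target = " ; ".join(f"{d}-{s}-{v}" for d, s, v in triples)
    return (
        {"task": "triple_recovery", "input": masked, "target": triple_target},
        {"task": "next_text_generation", "input": context, "target": continuation},
    )

ctx = "I booked a table at Cotto for 7pm."
nxt = "It is a cheap Italian place in the centre."
triples = [("restaurant", "name", "Cotto"), ("restaurant", "time", "7pm")]
for ex in make_pretraining_examples(ctx, nxt, triples):
    print(ex)
```

The point of the design is that ordinary text plus an off-the-shelf information extractor can stand in for dialogue-state annotation, which is what makes pretraining without TOD data possible.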
Investigating Reasons for Disagreement in Natural Language Inference
IF 10.9 · CAS Zone 1 (Computer Science)
Transactions of the Association for Computational Linguistics · Pub Date: 2022-09-07 · DOI: 10.1162/tacl_a_00523
Nan Jiang, M. Marneffe
{"title":"Investigating Reasons for Disagreement in Natural Language Inference","authors":"Nan Jiang, M. Marneffe","doi":"10.1162/tacl_a_00523","DOIUrl":"https://doi.org/10.1162/tacl_a_00523","url":null,"abstract":"Abstract We investigate how disagreement in natural language inference (NLI) annotation arises. We developed a taxonomy of disagreement sources with 10 categories spanning 3 high- level classes. We found that some disagreements are due to uncertainty in the sentence meaning, others to annotator biases and task artifacts, leading to different interpretations of the label distribution. We explore two modeling approaches for detecting items with potential disagreement: a 4-way classification with a “Complicated” label in addition to the three standard NLI labels, and a multilabel classification approach. We found that the multilabel classification is more expressive and gives better recall of the possible interpretations in the data.","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"10 1","pages":"1357-1374"},"PeriodicalIF":10.9,"publicationDate":"2022-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43852538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 19
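The two modeling approaches compared in the paper can be sketched as two classifier heads (illustrative PyTorch, not the authors' code; the hidden size and decision threshold are assumptions):

```python
# 4-way softmax (adds a "Complicated" label) vs. multilabel sigmoid head
# (an item may plausibly carry several NLI labels at once).

import torch
import torch.nn as nn

hidden = 768
encoder_out = torch.randn(2, hidden)   # stand-in for pooled encoder states

four_way = nn.Linear(hidden, 4)        # entail / neutral / contradict / complicated
probs_4way = four_way(encoder_out).softmax(dim=-1)   # mutually exclusive, sums to 1

multilabel = nn.Linear(hidden, 3)      # entail / neutral / contradict
probs_multi = multilabel(encoder_out).sigmoid()      # independent per label
plausible = probs_multi > 0.5          # 0, 1, or several labels per item

print(probs_4way.sum(dim=-1))  # tensor([1., 1.])
print(plausible)
```

The multilabel head is what lets the model represent items whose label distribution legitimately supports more than one interpretation.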
Adapting to the Long Tail: A Meta-Analysis of Transfer Learning Research for Language Understanding Tasks
IF 10.9 · CAS Zone 1 (Computer Science)
Transactions of the Association for Computational Linguistics · Pub Date: 2022-09-07 · eCollection Date: 2022-10-01 · DOI: 10.1162/tacl_a_00500
Aakanksha Naik, Jill Lehman, Carolyn Rosé
{"title":"Adapting to the Long Tail: A Meta-Analysis of Transfer Learning Research for Language Understanding Tasks.","authors":"Aakanksha Naik,&nbsp;Jill Lehman,&nbsp;Carolyn Rosé","doi":"10.1162/tacl_a_00500","DOIUrl":"https://doi.org/10.1162/tacl_a_00500","url":null,"abstract":"<p><p>Natural language understanding (NLU) has made massive progress driven by large benchmarks, but benchmarks often leave a long tail of infrequent phenomena underrepresented. We reflect on the question: <i>Have transfer learning methods sufficiently addressed the poor performance of benchmark-trained models on the long tail?</i> We conceptualize the long tail using macro-level dimensions (underrepresented genres, topics, etc.), and perform a qualitative meta-analysis of 100 representative papers on transfer learning research for NLU. Our analysis asks three questions: (i) Which long tail dimensions do transfer learning studies target? (ii) Which properties of adaptation methods help improve performance on the long tail? (iii) Which methodological gaps have greatest negative impact on long tail performance? Our answers highlight major avenues for future research in transfer learning for the long tail. Lastly, using our meta-analysis framework, we perform a case study comparing the performance of various adaptation methods on clinical narratives, which provides interesting insights that may enable us to make progress along these future avenues.</p>","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"10 ","pages":"956-980"},"PeriodicalIF":10.9,"publicationDate":"2022-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9590102/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40667339","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Efficient Methods for Natural Language Processing: A Survey
IF 10.9 · CAS Zone 1 (Computer Science)
Transactions of the Association for Computational Linguistics · Pub Date: 2022-08-31 · DOI: 10.1162/tacl_a_00577
Marcos Vinícius Treviso, Tianchu Ji, Ji-Ung Lee, Betty van Aken, Qingqing Cao, Manuel R. Ciosici, Michael Hassid, Kenneth Heafield, Sara Hooker, Pedro Henrique Martins, André F. T. Martins, Peter Milder, Colin Raffel, Edwin Simpson, N. Slonim, Niranjan Balasubramanian, Leon Derczynski, Roy Schwartz
{"title":"Efficient Methods for Natural Language Processing: A Survey","authors":"Marcos Vinícius Treviso, Tianchu Ji, Ji-Ung Lee, Betty van Aken, Qingqing Cao, Manuel R. Ciosici, Michael Hassid, Kenneth Heafield, Sara Hooker, Pedro Henrique Martins, André F. T. Martins, Peter Milder, Colin Raffel, Edwin Simpson, N. Slonim, Niranjan Balasubramanian, Leon Derczynski, Roy Schwartz","doi":"10.1162/tacl_a_00577","DOIUrl":"https://doi.org/10.1162/tacl_a_00577","url":null,"abstract":"Abstract Recent work in natural language processing (NLP) has yielded appealing results from scaling model parameters and training data; however, using only scale to improve performance means that resource consumption also grows. Such resources include data, time, storage, or energy, all of which are naturally limited and unevenly distributed. This motivates research into efficient methods that require fewer resources to achieve similar results. This survey synthesizes and relates current methods and findings in efficient NLP. We aim to provide both guidance for conducting NLP under limited resources, and point towards promising research directions for developing more efficient methods.","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"11 1","pages":"826-860"},"PeriodicalIF":10.9,"publicationDate":"2022-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45729583","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 38