{"title":"Training Mixed-Domain Translation Models via Federated Learning","authors":"P. Passban, T. Roosta, Rahul Gupta, Ankit R. Chadha, Clement Chung","doi":"10.48550/arXiv.2205.01557","DOIUrl":"https://doi.org/10.48550/arXiv.2205.01557","url":null,"abstract":"Training mixed-domain translation models is a complex task that demands tailored architec- tures and costly data preparation techniques. In this work, we leverage federated learning (FL) in order to tackle the problem. Our investiga- tion demonstrates that with slight modifications in the training process, neural machine trans- lation (NMT) engines can be easily adapted when an FL-based aggregation is applied to fuse different domains. Experimental results also show that engines built via FL are able to perform on par with state-of-the-art baselines that rely on centralized training techniques.We evaluate our hypothesis in the presence of five datasets with different sizes, from different domains, to translate from German into English and discuss how FL and NMT can mutually benefit from each other. In addition to provid- ing benchmarking results on the union of FL and NMT, we also propose a novel technique to dynamically control the communication band- width by selecting impactful parameters during FL updates. This is a significant achievement considering the large size of NMT engines that need to be exchanged between FL parties.","PeriodicalId":382084,"journal":{"name":"North American Chapter of the Association for Computational Linguistics","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116215545","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Adaptable Adapters","authors":"N. Moosavi, Quentin Delfosse, K. Kersting, Iryna Gurevych","doi":"10.48550/arXiv.2205.01549","DOIUrl":"https://doi.org/10.48550/arXiv.2205.01549","url":null,"abstract":"State-of-the-art pretrained NLP models contain a hundred million to trillion parameters. Adapters provide a parameter-efficient alternative for the full finetuning in which we can only finetune lightweight neural network layers on top of pretrained weights. Adapter layers are initialized randomly. However, existing work uses the same adapter architecture—i.e., the same adapter layer on top of each layer of the pretrained model—for every dataset, regardless of the properties of the dataset or the amount of available training data. In this work, we introduce adaptable adapters that contain (1) learning different activation functions for different layers and different input data, and (2) a learnable switch to select and only use the beneficial adapter layers. We show that adaptable adapters achieve on-par performances with the standard adapter architecture while using a considerably smaller number of adapter layers. In addition, we show that the selected adapter architecture by adaptable adapters transfers well across different data settings and similar tasks. We propose to use adaptable adapters for designing efficient and effective adapter architectures. The resulting adapters (a) contain about 50% of the learning parameters of the standard adapter and are therefore more efficient at training and inference, and require less storage space, and (b) achieve considerably higher performances in low-data settings.","PeriodicalId":382084,"journal":{"name":"North American Chapter of the Association for Computational Linguistics","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127329428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Holistic Framework for Analyzing the COVID-19 Vaccine Debate","authors":"Maria Leonor Pacheco, Tunazzina Islam, Monal Mahajan, Andrey Shor, Ming Yin, Pallavi V. Kulkarni, Dan Goldwasser","doi":"10.18653/v1/2022.naacl-main.427","DOIUrl":"https://doi.org/10.18653/v1/2022.naacl-main.427","url":null,"abstract":"The Covid-19 pandemic has led to infodemic of low quality information leading to poor health decisions. Combating the outcomes of this infodemic is not only a question of identifying false claims, but also reasoning about the decisions individuals make.In this work we propose a holistic analysis framework connecting stance and reason analysis, and fine-grained entity level moral sentiment analysis. We study how to model the dependencies between the different level of analysis and incorporate human insights into the learning process. Experiments show that our framework provides reliable predictions even in the low-supervision settings.","PeriodicalId":382084,"journal":{"name":"North American Chapter of the Association for Computational Linguistics","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114189746","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Embedding Hallucination for Few-shot Language Fine-tuning","authors":"Yiren Jian, Chongyang Gao, Soroush Vosoughi","doi":"10.48550/arXiv.2205.01307","DOIUrl":"https://doi.org/10.48550/arXiv.2205.01307","url":null,"abstract":"Few-shot language learners adapt knowledge from a pre-trained model to recognize novel classes from a few-labeled sentences. In such settings, fine-tuning a pre-trained language model can cause severe over-fitting. In this paper, we propose an Embedding Hallucination (EmbedHalluc) method, which generates auxiliary embedding-label pairs to expand the fine-tuning dataset. The hallucinator is trained by playing an adversarial game with the discriminator, such that the hallucinated embedding is indiscriminative to the real ones in the fine-tuning dataset. By training with the extended dataset, the language learner effectively learns from the diverse hallucinated embeddings to overcome the over-fitting issue. Experiments demonstrate that our proposed method is effective in a wide range of language tasks, outperforming current fine-tuning methods. Further, we show that EmbedHalluc outperforms other methods that address this over-fitting problem, such as common data augmentation, semi-supervised pseudo-labeling, and regularization.","PeriodicalId":382084,"journal":{"name":"North American Chapter of the Association for Computational Linguistics","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117271792","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SUBS: Subtree Substitution for Compositional Semantic Parsing","authors":"Jingfeng Yang, Le Zhang, Diyi Yang","doi":"10.48550/arXiv.2205.01538","DOIUrl":"https://doi.org/10.48550/arXiv.2205.01538","url":null,"abstract":"Although sequence-to-sequence models often achieve good performance in semantic parsing for i.i.d. data, their performance is still inferior in compositional generalization. Several data augmentation methods have been proposed to alleviate this problem. However, prior work only leveraged superficial grammar or rules for data augmentation, which resulted in limited improvement. We propose to use subtree substitution for compositional data augmentation, where we consider subtrees with similar semantic functions as exchangeable. Our experiments showed that such augmented data led to significantly better performance on Scan and GeoQuery, and reached new SOTA on compositional split of GeoQuery.","PeriodicalId":382084,"journal":{"name":"North American Chapter of the Association for Computational Linguistics","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124347758","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semantic Diversity in Dialogue with Natural Language Inference","authors":"Katherine Stasaski, Marti A. Hearst","doi":"10.48550/arXiv.2205.01497","DOIUrl":"https://doi.org/10.48550/arXiv.2205.01497","url":null,"abstract":"Generating diverse, interesting responses to chitchat conversations is a problem for neural conversational agents. This paper makes two substantial contributions to improving diversity in dialogue generation. First, we propose a novel metric which uses Natural Language Inference (NLI) to measure the semantic diversity of a set of model responses for a conversation. We evaluate this metric using an established framework (Tevet and Berant, 2021) and find strong evidence indicating NLI Diversity is correlated with semantic diversity. Specifically, we show that the contradiction relation is more useful than the neutral relation for measuring this diversity and that incorporating the NLI model’s confidence achieves state-of-the-art results. Second, we demonstrate how to iteratively improve the semantic diversity of a sampled set of responses via a new generation procedure called Diversity Threshold Generation, which results in an average 137% increase in NLI Diversity compared to standard generation procedures.","PeriodicalId":382084,"journal":{"name":"North American Chapter of the Association for Computational Linguistics","volume":"97 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114855312","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"CTM - A Model for Large-Scale Multi-View Tweet Topic Classification","authors":"Vivek Kulkarni, Kenny Leung, A. Haghighi","doi":"10.48550/arXiv.2205.01603","DOIUrl":"https://doi.org/10.48550/arXiv.2205.01603","url":null,"abstract":"Automatically associating social media posts with topics is an important prerequisite for effective search and recommendation on many social media platforms. However, topic classification of such posts is quite challenging because of (a) a large topic space (b) short text with weak topical cues, and (c) multiple topic associations per post. In contrast to most prior work which only focuses on post-classification into a small number of topics (10-20), we consider the task of large-scale topic classification in the context of Twitter where the topic space is 10 times larger with potentially multiple topic associations per Tweet. We address the challenges above and propose a novel neural model, that (a) supports a large topic space of 300 topics (b) takes a holistic approach to tweet content modeling – leveraging multi-modal content, author context, and deeper semantic cues in the Tweet. Our method offers an effective way to classify Tweets into topics at scale by yielding superior performance to other approaches (a relative lift of mathbf{20}% in median average precision score) and has been successfully deployed in production at Twitter.","PeriodicalId":382084,"journal":{"name":"North American Chapter of the Association for Computational Linguistics","volume":"2006 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127657506","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semantically Informed Slang Interpretation","authors":"Zhewei Sun, R. Zemel, Yang Xu","doi":"10.48550/arXiv.2205.00616","DOIUrl":"https://doi.org/10.48550/arXiv.2205.00616","url":null,"abstract":"Slang is a predominant form of informal language making flexible and extended use of words that is notoriously hard for natural language processing systems to interpret. Existing approaches to slang interpretation tend to rely on context but ignore semantic extensions common in slang word usage. We propose a semantically informed slang interpretation (SSI) framework that considers jointly the contextual and semantic appropriateness of a candidate interpretation for a query slang. We perform rigorous evaluation on two large-scale online slang dictionaries and show that our approach not only achieves state-of-the-art accuracy for slang interpretation in English, but also does so in zero-shot and few-shot scenarios where training data is sparse. Furthermore, we show how the same framework can be applied to enhancing machine translation of slang from English to other languages. Our work creates opportunities for the automated interpretation and translation of informal language.","PeriodicalId":382084,"journal":{"name":"North American Chapter of the Association for Computational Linguistics","volume":"117 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124871548","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Teaching BERT to Wait: Balancing Accuracy and Latency for Streaming Disfluency Detection","authors":"Angelica Chen, V. Zayats, D. D. Walker, D. Padfield","doi":"10.48550/arXiv.2205.00620","DOIUrl":"https://doi.org/10.48550/arXiv.2205.00620","url":null,"abstract":"In modern interactive speech-based systems, speech is consumed and transcribed incrementally prior to having disfluencies removed. While this post-processing step is crucial for producing clean transcripts and high performance on downstream tasks (e.g. machine translation), most current state-of-the-art NLP models such as the Transformer operate non-incrementally, potentially causing unacceptable delays for the user. In this work we propose a streaming BERT-based sequence tagging model that, combined with a novel training objective, is capable of detecting disfluencies in real-time while balancing accuracy and latency. This is accomplished by training the model to decide whether to immediately output a prediction for the current input or to wait for further context, in essence learning to dynamically size the lookahead window. Our results demonstrate that our model produces comparably accurate predictions and does so sooner than our baselines, with lower flicker. Furthermore, the model attains state-of-the-art latency and stability scores when compared with recent work on incremental disfluency detection.","PeriodicalId":382084,"journal":{"name":"North American Chapter of the Association for Computational Linguistics","volume":"112 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124281822","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quality-Aware Decoding for Neural Machine Translation","authors":"Patrick Fernandes, António Farinhas, Ricardo Rei, José G. C. de Souza, Perez Ogayo, Graham Neubig, André F. T. Martins","doi":"10.48550/arXiv.2205.00978","DOIUrl":"https://doi.org/10.48550/arXiv.2205.00978","url":null,"abstract":"Despite the progress in machine translation quality estimation and evaluation in the last years, decoding in neural machine translation (NMT) is mostly oblivious to this and centers around finding the most probable translation according to the model (MAP decoding), approximated with beam search. In this paper, we bring together these two lines of research and propose quality-aware decoding for NMT, by leveraging recent breakthroughs in reference-free and reference-based MT evaluation through various inference methods like N-best reranking and minimum Bayes risk decoding. We perform an extensive comparison of various possible candidate generation and ranking methods across four datasets and two model classes and find that quality-aware decoding consistently outperforms MAP-based decoding according both to state-of-the-art automatic metrics (COMET and BLEURT) and to human assessments.","PeriodicalId":382084,"journal":{"name":"North American Chapter of the Association for Computational Linguistics","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131207526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}