{"title":"Towards Debiasing Translation Artifacts","authors":"Koel Dutta Chowdhury, Rricha Jalota, C. España-Bonet, Josef van Genabith","doi":"10.48550/arXiv.2205.08001","DOIUrl":"https://doi.org/10.48550/arXiv.2205.08001","url":null,"abstract":"Cross-lingual natural language processing relies on translation, either by humans or machines, at different levels, from translating training data to translating test sets. However, compared to original texts in the same language, translations possess distinct qualities referred to as translationese. Previous research has shown that these translation artifacts influence the performance of a variety of cross-lingual tasks. In this work, we propose a novel approach to reducing translationese by extending an established bias-removal technique. We use the Iterative Null-space Projection (INLP) algorithm, and show by measuring classification accuracy before and after debiasing, that translationese is reduced at both sentence and word level. We evaluate the utility of debiasing translationese on a natural language inference (NLI) task, and show that by reducing this bias, NLI accuracy improves. To the best of our knowledge, this is the first study to debias translationese as represented in latent embedding space.","PeriodicalId":382084,"journal":{"name":"North American Chapter of the Association for Computational Linguistics","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129484669","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FactPEGASUS: Factuality-Aware Pre-training and Fine-tuning for Abstractive Summarization","authors":"David Wan, Mohit Bansal","doi":"10.48550/arXiv.2205.07830","DOIUrl":"https://doi.org/10.48550/arXiv.2205.07830","url":null,"abstract":"We present FactPEGASUS, an abstractive summarization model that addresses the problem of factuality during pre-training and fine-tuning: (1) We augment the sentence selection strategy of PEGASUS’s (Zhang et al., 2019) pre-training objective to create pseudo-summaries that are both important and factual; (2) We introduce three complementary components for fine-tuning. The corrector removes hallucinations present in the reference summary, the contrastor uses contrastive learning to better differentiate nonfactual summaries from factual ones, and the connector bridges the gap between the pre-training and fine-tuning for better transfer of knowledge. Experiments on three downstream tasks demonstrate that FactPEGASUS substantially improves factuality evaluated by multiple automatic metrics and humans. Our thorough analysis suggests that FactPEGASUS is more factual than using the original pre-training objective in zero-shot and few-shot settings, retains factual behavior more robustly than strong baselines, and does not rely entirely on becoming more extractive to improve factuality.","PeriodicalId":382084,"journal":{"name":"North American Chapter of the Association for Computational Linguistics","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114422318","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hero-Gang Neural Model For Named Entity Recognition","authors":"Jinpeng Hu, Yaling Shen, Yang Liu, Xiang Wan, Tsung-Hui Chang","doi":"10.48550/arXiv.2205.07177","DOIUrl":"https://doi.org/10.48550/arXiv.2205.07177","url":null,"abstract":"Named entity recognition (NER) is a fundamental and important task in NLP, aiming at identifying named entities (NEs) from free text. Recently, since the multi-head attention mechanism applied in the Transformer model can effectively capture longer contextual information, Transformer-based models have become the mainstream methods and have achieved significant performance in this task. Unfortunately, although these models can capture effective global context information, they are still limited in the local feature and position information extraction, which is critical in NER. In this paper, to address this limitation, we propose a novel Hero-Gang Neural structure (HGN), including the Hero and Gang module, to leverage both global and local information to promote NER. Specifically, the Hero module is composed of a Transformer-based encoder to maintain the advantage of the self-attention mechanism, and the Gang module utilizes a multi-window recurrent module to extract local features and position information under the guidance of the Hero module. Afterward, the proposed multi-window attention effectively combines global information and multiple local features for predicting entity labels. Experimental results on several benchmark datasets demonstrate the effectiveness of our proposed model.","PeriodicalId":382084,"journal":{"name":"North American Chapter of the Association for Computational Linguistics","volume":"162 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123485253","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiformer: A Head-Configurable Transformer-Based Model for Direct Speech Translation","authors":"Gerard Sant, Gerard I. Gállego, Belen Alastruey, M. Costa-jussà","doi":"10.48550/arXiv.2205.07100","DOIUrl":"https://doi.org/10.48550/arXiv.2205.07100","url":null,"abstract":"Transformer-based models have been achieving state-of-the-art results in several fields of Natural Language Processing. However, its direct application to speech tasks is not trivial. The nature of this sequences carries problems such as long sequence lengths and redundancy between adjacent tokens. Therefore, we believe that regular self-attention mechanism might not be well suited for it. Different approaches have been proposed to overcome these problems, such as the use of efficient attention mechanisms. However, the use of these methods usually comes with a cost, which is a performance reduction caused by information loss. In this study, we present the Multiformer, a Transformer-based model which allows the use of different attention mechanisms on each head. By doing this, the model is able to bias the self-attention towards the extraction of more diverse token interactions, and the information loss is reduced. Finally, we perform an analysis of the head contributions, and we observe that those architectures where all heads relevance is uniformly distributed obtain better results. Our results show that mixing attention patterns along the different heads and layers outperforms our baseline by up to 0.7 BLEU.","PeriodicalId":382084,"journal":{"name":"North American Chapter of the Association for Computational Linguistics","volume":"408 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133600450","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MuCPAD: A Multi-Domain Chinese Predicate-Argument Dataset","authors":"Yahui Liu, Hao Yang, Chen Gong, Qingrong Xia, Zhenghua Li, M. Zhang","doi":"10.48550/arXiv.2205.06703","DOIUrl":"https://doi.org/10.48550/arXiv.2205.06703","url":null,"abstract":"During the past decade, neural network models have made tremendous progress on in-domain semantic role labeling (SRL). However, performance drops dramatically under the out-of-domain setting. In order to facilitate research on cross-domain SRL, this paper presents MuCPAD, a multi-domain Chinese predicate-argument dataset, which consists of 30,897 sentences and 92,051 predicates from six different domains. MuCPAD exhibits three important features. 1) Based on a frame-free annotation methodology, we avoid writing complex frames for new predicates. 2) We explicitly annotate omitted core arguments to recover more complete semantic structure, considering that omission of content words is ubiquitous in multi-domain Chinese texts. 3) We compile 53 pages of annotation guidelines and adopt strict double annotation for improving data quality. This paper describes in detail the annotation methodology and annotation process of MuCPAD, and presents in-depth data analysis. We also give benchmark results on cross-domain SRL based on MuCPAD.","PeriodicalId":382084,"journal":{"name":"North American Chapter of the Association for Computational Linguistics","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132349393","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploiting Inductive Bias in Transformers for Unsupervised Disentanglement of Syntax and Semantics with VAEs","authors":"G. Felhi, Joseph Le Roux, Djamé Seddah","doi":"10.48550/arXiv.2205.05943","DOIUrl":"https://doi.org/10.48550/arXiv.2205.05943","url":null,"abstract":"We propose a generative model for text generation, which exhibits disentangled latent representations of syntax and semantics. Contrary to previous work, this model does not need syntactic information such as constituency parses, or semantic information such as paraphrase pairs. Our model relies solely on the inductive bias found in attention-based architectures such as Transformers. In the attention of Transformers, keys handle information selection while values specify what information is conveyed. Our model, dubbed QKVAE, uses Attention in its decoder to read latent variables where one latent variable infers keys while another infers values. We run experiments on latent representations and experiments on syntax/semantics transfer which show that QKVAE displays clear signs of disentangled syntax and semantics. We also show that our model displays competitive syntax transfer capabilities when compared to supervised models and that comparable supervised models need a fairly large amount of data (more than 50K samples) to outperform it on both syntactic and semantic transfer. The code for our experiments is publicly available.","PeriodicalId":382084,"journal":{"name":"North American Chapter of the Association for Computational Linguistics","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123253665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"TreeMix: Compositional Constituency-based Data Augmentation for Natural Language Understanding","authors":"Le Zhang, Zichao Yang, Diyi Yang","doi":"10.48550/arXiv.2205.06153","DOIUrl":"https://doi.org/10.48550/arXiv.2205.06153","url":null,"abstract":"Data augmentation is an effective approach to tackle over-fitting. Many previous works have proposed different data augmentations strategies for NLP, such as noise injection, word replacement, back-translation etc. Though effective, they missed one important characteristic of language–compositionality, meaning of a complex expression is built from its sub-parts. Motivated by this, we propose a compositional data augmentation approach for natural language understanding called TreeMix. Specifically, TreeMix leverages constituency parsing tree to decompose sentences into constituent sub-structures and the Mixup data augmentation technique to recombine them to generate new sentences. Compared with previous approaches, TreeMix introduces greater diversity to the samples generated and encourages models to learn compositionality of NLP data. Extensive experiments on text classification and SCAN demonstrate that TreeMix outperforms current state-of-the-art data augmentation methods.","PeriodicalId":382084,"journal":{"name":"North American Chapter of the Association for Computational Linguistics","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121038018","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lifting the Curse of Multilinguality by Pre-training Modular Transformers","authors":"Jonas Pfeiffer, Naman Goyal, Xi Victoria Lin, Xian Li, James Cross, Sebastian Riedel, Mikel Artetxe","doi":"10.48550/arXiv.2205.06266","DOIUrl":"https://doi.org/10.48550/arXiv.2205.06266","url":null,"abstract":"Multilingual pre-trained models are known to suffer from the curse of multilinguality, which causes per-language performance to drop as they cover more languages. We address this issue by introducing language-specific modules, which allows us to grow the total capacity of the model, while keeping the total number of trainable parameters per language constant. In contrast with prior work that learns language-specific components post-hoc, we pre-train the modules of our Cross-lingual Modular (X-Mod) models from the start. Our experiments on natural language inference, named entity recognition and question answering show that our approach not only mitigates the negative interference between languages, but also enables positive transfer, resulting in improved monolingual and cross-lingual performance. Furthermore, our approach enables adding languages post-hoc with no measurable drop in performance, no longer limiting the model usage to the set of pre-trained languages.","PeriodicalId":382084,"journal":{"name":"North American Chapter of the Association for Computational Linguistics","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128526656","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Computational Acquisition Model for Multimodal Word Categorization","authors":"Uri Berger, Gabriel Stanovsky, Omri Abend, Lea Frermann","doi":"10.48550/arXiv.2205.05974","DOIUrl":"https://doi.org/10.48550/arXiv.2205.05974","url":null,"abstract":"Recent advances in self-supervised modeling of text and images open new opportunities for computational models of child language acquisition, which is believed to rely heavily on cross-modal signals. However, prior studies has been limited by their reliance on vision models trained on large image datasets annotated with a pre-defined set of depicted object categories. This is (a) not faithful to the information children receive and (b) prohibits the evaluation of such models with respect to category learning tasks, due to the pre-imposed category structure. We address this gap, and present a cognitively-inspired, multimodal acquisition model, trained from image-caption pairs on naturalistic data using cross-modal self-supervision. We show that the model learns word categories and object recognition abilities, and presents trends reminiscent of ones reported in the developmental literature.","PeriodicalId":382084,"journal":{"name":"North American Chapter of the Association for Computational Linguistics","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122942192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Cryptocurrency Bubble Detection: A New Stock Market Dataset, Financial Task & Hyperbolic Models","authors":"Ramit Sawhney, Shivam Agarwal, Vivek Mittal, Paolo Rosso, Vikram Nanda, S. Chava","doi":"10.48550/arXiv.2206.06320","DOIUrl":"https://doi.org/10.48550/arXiv.2206.06320","url":null,"abstract":"The rapid spread of information over social media influences quantitative trading and investments. The growing popularity of speculative trading of highly volatile assets such as cryptocurrencies and meme stocks presents a fresh challenge in the financial realm. Investigating such “bubbles” - periods of sudden anomalous behavior of markets are critical in better understanding investor behavior and market dynamics. However, high volatility coupled with massive volumes of chaotic social media texts, especially for underexplored assets like cryptocoins pose a challenge to existing methods. Taking the first step towards NLP for cryptocoins, we present and publicly release CryptoBubbles, a novel multi- span identification task for bubble detection, and a dataset of more than 400 cryptocoins from 9 exchanges over five years spanning over two million tweets. Further, we develop a set of sequence-to-sequence hyperbolic models suited to this multi-span identification task based on the power-law dynamics of cryptocurrencies and user behavior on social media. We further test the effectiveness of our models under zero-shot settings on a test set of Reddit posts pertaining to 29 “meme stocks”, which see an increase in trade volume due to social media hype. Through quantitative, qualitative, and zero-shot analyses on Reddit and Twitter spanning cryptocoins and meme-stocks, we show the practical applicability of CryptoBubbles and hyperbolic models.","PeriodicalId":382084,"journal":{"name":"North American Chapter of the Association for Computational Linguistics","volume":"276 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122124181","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}