Proceedings of the Second Workshop on Insights from Negative Results in NLP: Latest Publications

The Highs and Lows of Simple Lexical Domain Adaptation Approaches for Neural Machine Translation
Pub Date: 2021-01-02 | DOI: 10.18653/v1/2021.insights-1.12
Nikolay Bogoychev, Pinzhen Chen
Abstract: Machine translation systems are vulnerable to domain mismatch, especially in a low-resource scenario. Out-of-domain translations are often of poor quality and prone to hallucinations, due to exposure bias and the decoder acting as a language model. We adopt two approaches to alleviate this problem: lexical shortlisting restricted by IBM statistical alignments, and hypothesis reranking based on similarity. The methods are computationally cheap and show success on low-resource out-of-domain test sets. However, they lose their advantage when there is sufficient data or the domain mismatch is too great. This is due to both the IBM model losing its advantage over the implicitly learned neural alignment and issues with subword segmentation of unseen words.
Citations: 2
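The second method in the abstract above, similarity-based hypothesis reranking, can be illustrated with a minimal sketch. The scoring used here (cosine similarity between a bag-of-words hypothesis vector and an in-domain centroid, blended with the model score via a weight `alpha`) is an assumption for illustration; the paper's exact similarity measure and weighting are not reproduced in this listing.

```python
from collections import Counter
import math

def bow_vector(text):
    """Simple bag-of-words vector as a token -> count mapping."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rerank(nbest, domain_texts, alpha=0.5):
    """Rerank (hypothesis, model_score) pairs by blending the model score
    with similarity to an in-domain bag-of-words centroid (illustrative)."""
    centroid = Counter()
    for t in domain_texts:
        centroid.update(bow_vector(t))
    scored = [
        (alpha * score + (1 - alpha) * cosine(bow_vector(hyp), centroid), hyp)
        for hyp, score in nbest
    ]
    return [hyp for _, hyp in sorted(scored, reverse=True)]

# Toy usage: prefer the hypothesis closest to a small in-domain sample.
nbest = [("the patient received a dose of medicine", -1.2),
         ("the client received a portion of drugs", -1.1)]
print(rerank(nbest, ["the patient was given the prescribed dose"]))
```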
Corrected CBOW Performs as well as Skip-gram
Pub Date: 2020-12-30 | DOI: 10.18653/v1/2021.insights-1.1
Ozan Irsoy, Adrian Benton, K. Stratos
Abstract: Mikolov et al. (2013a) observed that continuous bag-of-words (CBOW) word embeddings tend to underperform Skip-gram (SG) embeddings, and this finding has been reported in subsequent works. We find that these observations are driven not by fundamental differences in their training objectives, but more likely by faulty negative-sampling CBOW implementations in popular libraries such as the official implementation, word2vec.c, and Gensim. We show that after correcting a bug in the CBOW gradient update, one can learn CBOW word embeddings that are fully competitive with SG on various intrinsic and extrinsic tasks, while being many times faster to train.
Citations: 5
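The bug described above concerns the CBOW negative-sampling update. Below is a minimal NumPy sketch of one such update in which the gradient flowing back to the context (input) vectors is scaled by 1/|context|, since the hidden vector is the mean of the context embeddings; implementations that apply the unscaled gradient to every context word are, as reported, what makes CBOW appear weaker. The vocabulary size, dimensions, and learning rate are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
V, D = 1000, 50                      # toy vocabulary size and embedding dimension
W_in = rng.normal(0, 0.1, (V, D))    # input (context) embeddings
W_out = np.zeros((V, D))             # output embeddings

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cbow_step(context_ids, center_id, negative_ids, lr=0.05):
    """One negative-sampling CBOW update with the 1/|context| scaling."""
    h = W_in[context_ids].mean(axis=0)           # averaged context representation
    grad_h = np.zeros(D)
    for wid, label in [(center_id, 1.0)] + [(n, 0.0) for n in negative_ids]:
        score = sigmoid(W_out[wid] @ h)
        g = score - label                        # gradient of the logistic loss
        grad_h += g * W_out[wid]
        W_out[wid] -= lr * g * h
    # Corrected update: divide by the context size, because h is the mean
    # of the context vectors, before updating each of them.
    W_in[context_ids] -= lr * grad_h / len(context_ids)

cbow_step(context_ids=[3, 17, 42, 99], center_id=7, negative_ids=[250, 600, 801])
```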
An Investigation into the Contribution of Locally Aggregated Descriptors to Figurative Language Identification
DOI: 10.18653/v1/2021.insights-1.15
Sina Mahdipour Saravani, Ritwik Banerjee, I. Ray
Abstract: In natural language understanding, topics that touch upon figurative language and pragmatics are notably difficult. We probe a novel use of locally aggregated descriptors, specifically an architecture called NeXtVLAD, motivated by its accomplishments in computer vision and by its tremendous success in the FigLang2020 sarcasm detection task, where the reported F1 score of 93.1% is 14% higher than the next best result. We specifically investigate the extent to which the novel architecture is responsible for this boost, and find that it does not provide statistically significant benefits. Deep learning approaches are expensive, and we hope our insights highlighting the lack of benefits from introducing a resource-intensive component will aid future research to distill the effective elements from long and complex pipelines, thereby providing a boost to the wider research community.
Citations: 4
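For readers unfamiliar with the aggregation component investigated above, here is a simplified NumPy sketch of a NetVLAD-style layer: local descriptors are softly assigned to learnable cluster centers and their residuals are pooled. The full NeXtVLAD adds grouping and gating, which are omitted here, and all shapes and random parameters are illustrative assumptions.

```python
import numpy as np

def netvlad(X, centers, W, b):
    """Aggregate N local descriptors (N x D) into a K*D vector of
    soft-assigned residuals against K cluster centers."""
    logits = X @ W + b                               # (N, K) soft-assignment scores
    a = np.exp(logits - logits.max(axis=1, keepdims=True))
    a /= a.sum(axis=1, keepdims=True)                # softmax over clusters
    # V[k] = sum_n a[n, k] * (x[n] - c[k])
    V = a.T @ X - a.sum(axis=0)[:, None] * centers   # (K, D)
    V /= np.linalg.norm(V, axis=1, keepdims=True) + 1e-12   # intra-normalization
    v = V.ravel()
    return v / (np.linalg.norm(v) + 1e-12)           # final L2 normalization

rng = np.random.default_rng(0)
N, D, K = 32, 64, 8                                  # e.g. 32 token vectors of size 64
X = rng.normal(size=(N, D))
out = netvlad(X, centers=rng.normal(size=(K, D)),
              W=rng.normal(size=(D, K)), b=np.zeros(K))
print(out.shape)                                     # (512,)
```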
Recurrent Attention for the Transformer
DOI: 10.18653/v1/2021.insights-1.10
Jan Rosendahl, Christian Herold, Frithjof Petrick, H. Ney
Abstract: In this work, we conduct a comprehensive investigation of one of the centerpieces of modern machine translation systems: the encoder-decoder attention mechanism. Motivated by the concept of first-order alignments, we extend the (cross-)attention mechanism with a recurrent connection, allowing direct access to previous attention/alignment decisions. We propose several ways to include such a recurrency in the attention mechanism. Verifying their performance across different translation tasks, we conclude that these extensions and dependencies are not beneficial for the translation performance of the Transformer architecture.
Citations: 1
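The paper proposes several variants of such a recurrency; the sketch below shows just one plausible form, in which the cross-attention logits at each decoder step are biased by a learned transformation of the previous step's attention distribution. The shapes, the recurrence matrix W_rec (tied to the source length here purely for brevity), and the scaling are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def recurrent_cross_attention(Q, K, V, W_rec):
    """Cross-attention over decoder steps where each step's logits are biased
    by the previous step's attention weights (one possible recurrency)."""
    T, d = Q.shape          # decoder steps, model dimension
    S = K.shape[0]          # source length
    prev = np.zeros(S)      # previous attention distribution
    outputs, weights = [], []
    for t in range(T):
        logits = (K @ Q[t]) / np.sqrt(d) + W_rec @ prev   # recurrent bias term
        alpha = np.exp(logits - logits.max())
        alpha /= alpha.sum()
        outputs.append(alpha @ V)
        weights.append(alpha)
        prev = alpha
    return np.stack(outputs), np.stack(weights)

rng = np.random.default_rng(0)
S, T, d = 6, 4, 16
ctx, att = recurrent_cross_attention(rng.normal(size=(T, d)),
                                     rng.normal(size=(S, d)),
                                     rng.normal(size=(S, d)),
                                     W_rec=0.1 * rng.normal(size=(S, S)))
print(ctx.shape, att.shape)   # (4, 16) (4, 6)
```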
When does Further Pre-training MLM Help? An Empirical Study on Task-Oriented Dialog Pre-training
DOI: 10.18653/v1/2021.insights-1.9
Qi Zhu, Yuxian Gu, Lingxiao Luo, Bing Li, Cheng Li, Wei Peng, Minlie Huang, Xiaoyan Zhu
Abstract: Further pre-training language models on in-domain data (domain-adaptive pre-training, DAPT) or task-relevant data (task-adaptive pre-training, TAPT) before fine-tuning has been shown to improve downstream task performance. However, in task-oriented dialog modeling, we observe that further pre-training with MLM does not always boost the performance on a downstream task. We find that DAPT is beneficial in the low-resource setting, but as the fine-tuning data size grows, DAPT becomes less beneficial or even useless, and scaling up the DAPT data does not help. Through Representational Similarity Analysis, we conclude that more fine-tuning data yields a greater change in the model's representations and thus reduces the influence of initialization.
Citations: 10
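Representational Similarity Analysis, mentioned in the abstract above, compares how two sets of representations organize the same inputs. A minimal sketch, assuming the hidden states have already been extracted as (examples x features) matrices; Pearson correlation over cosine-similarity matrices is one common RSA choice, not necessarily the authors' exact setup.

```python
import numpy as np

def rsa_similarity(H1, H2):
    """RSA between two representation matrices (n_examples x dim): correlate
    their example-by-example similarity structures."""
    def rdm(H):
        Hc = H - H.mean(axis=1, keepdims=True)
        Hc /= np.linalg.norm(Hc, axis=1, keepdims=True) + 1e-12
        return Hc @ Hc.T                     # cosine similarity matrix (n x n)
    n = H1.shape[0]
    iu = np.triu_indices(n, k=1)             # upper triangle, excluding diagonal
    a, b = rdm(H1)[iu], rdm(H2)[iu]
    return np.corrcoef(a, b)[0, 1]           # Pearson correlation of the two RDMs

rng = np.random.default_rng(0)
before = rng.normal(size=(100, 768))                    # e.g. states before fine-tuning
after = before + 0.1 * rng.normal(size=(100, 768))      # and after a small update
print(rsa_similarity(before, after))                    # close to 1.0: little change
```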
Are BERTs Sensitive to Native Interference in L2 Production?
DOI: 10.18653/v1/2021.insights-1.6
Zixin Tang, P. Mitra, D. Reitter
Abstract: Using the essay portion of The International Corpus Network of Asian Learners of English (ICNALE) and the TOEFL11 corpus, we fine-tuned neural language models based on BERT to predict English learners' native languages. Results showed that neural models can learn to represent and detect such native-language influence, but multilingually trained models have no advantage in doing so.
Citations: 1
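A minimal sketch of the setup the abstract describes: a BERT-based sequence classifier whose labels are the learners' native languages. The checkpoint name, the choice of 11 labels (matching TOEFL11), and the single forward pass shown are assumptions for illustration; the study itself fine-tunes on the ICNALE and TOEFL11 essays and compares monolingual and multilingual variants.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical checkpoint choice for illustration.
name = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=11)

essay = "In my country, many people is thinking that part-time job is good for student."
inputs = tokenizer(essay, truncation=True, max_length=512, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits          # (1, 11): one score per candidate L1
print(logits.argmax(dim=-1))                 # predicted native-language class id
```

Fine-tuning would then proceed with standard cross-entropy over the essays' native-language labels.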
Learning Data Augmentation Schedules for Natural Language Processing
DOI: 10.18653/v1/2021.insights-1.14
Daphné Chopard, M. Treder, Irena Spasic
Abstract: Despite its proven efficiency in other fields, data augmentation is less popular in the context of natural language processing (NLP) due to its complexity and limited results. A recent study (Longpre et al., 2020) showed, for example, that task-agnostic data augmentations fail to consistently boost the performance of pretrained transformers, even in low-data regimes. In this paper, we investigate whether data-driven augmentation scheduling and the integration of a wider set of transformations can lead to improved performance where fixed and limited policies were unsuccessful. Our results suggest that, while this approach can help the training process in some settings, the improvements are unsubstantial. This negative result is meant to help researchers better understand the limitations of data augmentation for NLP.
Citations: 1
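One way to realize data-driven augmentation scheduling of the kind the abstract refers to is a bandit-style policy that shifts probability mass toward transformations that recently helped validation performance. The sketch below is such a generic scheduler under assumed toy transformations; the paper's actual policy search and operation set are not given in this listing.

```python
import math
import random

class AugmentationScheduler:
    """Softmax bandit over augmentation ops: ops that yield higher validation
    reward are sampled more often in later epochs."""
    def __init__(self, ops, temperature=0.5):
        self.ops = ops                         # name -> callable(str) -> str
        self.value = {name: 0.0 for name in ops}
        self.temperature = temperature

    def sample(self):
        names = list(self.ops)
        weights = [math.exp(self.value[n] / self.temperature) for n in names]
        name = random.choices(names, weights=weights, k=1)[0]
        return name, self.ops[name]

    def update(self, name, reward, lr=0.3):
        """Move the op's value estimate toward the observed reward (e.g. dev-set gain)."""
        self.value[name] += lr * (reward - self.value[name])

# Toy ops standing in for real transformations such as synonym swap or deletion.
ops = {"drop_word": lambda s: " ".join(w for w in s.split()[:-1]),
       "duplicate": lambda s: s + " " + s.split()[-1]}
sched = AugmentationScheduler(ops)
name, op = sched.sample()
print(name, "->", op("data augmentation is less popular in nlp"))
sched.update(name, reward=0.2)     # feed back the measured validation gain
```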
Zero-Shot Cross-Lingual Transfer is a Hard Baseline to Beat in German Fine-Grained Entity Typing
DOI: 10.18653/v1/2021.insights-1.7
Sabine Weber, Mark Steedman
Abstract: The training of NLP models often requires large amounts of labelled training data, which makes it difficult to extend existing models to new languages. While zero-shot cross-lingual transfer relies on multilingual word embeddings to apply a model trained on one language to another, Yarowsky and Ngai (2001) propose the method of annotation projection to generate training data without manual annotation. This method was used successfully for the tasks of named entity recognition and coarse-grained entity typing, but we show that it is outperformed by zero-shot cross-lingual transfer when applied to the similar task of fine-grained entity typing. In our study of fine-grained entity typing with the FIGER type ontology for German, we show that annotation projection amplifies the English model's tendency to underpredict level-2 labels and is beaten by zero-shot cross-lingual transfer on three novel test sets.
Citations: 1
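Annotation projection, the baseline compared above, transfers labels from annotated English tokens to aligned German tokens. A minimal sketch under the assumption that word alignments are already available as index pairs; the example labels are FIGER-style but hypothetical, and the alignment tool used in the actual study is not reproduced here.

```python
def project_annotations(src_labels, alignments, tgt_len, default="O"):
    """Copy token-level type labels from a labelled source sentence onto an
    unlabelled target sentence via word-alignment (src_idx, tgt_idx) pairs."""
    tgt_labels = [default] * tgt_len
    for src_idx, tgt_idx in alignments:
        if src_labels[src_idx] != default:
            tgt_labels[tgt_idx] = src_labels[src_idx]
    return tgt_labels

# English: "Angela Merkel visited Paris" with FIGER-style types.
src_labels = ["/person/politician", "/person/politician", "O", "/location/city"]
# German:  "Angela Merkel besuchte Paris"
alignments = [(0, 0), (1, 1), (2, 2), (3, 3)]
print(project_annotations(src_labels, alignments, tgt_len=4))
```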
Backtranslation in Neural Morphological Inflection
DOI: 10.18653/v1/2021.insights-1.13
Ling Liu, Mans Hulden
Abstract: Backtranslation is a common technique for leveraging unlabeled data in low-resource machine translation scenarios. The method is directly applicable to morphological inflection generation if unlabeled word forms are available. This paper evaluates the potential of backtranslation for morphological inflection using data from six languages, with labeled data drawn from the SIGMORPHON shared task resource and unlabeled data from different sources. Our core finding is that backtranslation can offer modest improvements in low-resource scenarios, but only if the unlabeled data is very clean and has been filtered by the same annotation standards as the labeled data.
Citations: 9
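The backtranslation recipe evaluated above can be sketched as a data-side procedure: a reverse (form -> lemma, tags) model labels unlabeled word forms, and the resulting pseudo-triples are added to the gold data for training the forward inflection model. The reverse model below is any callable; the dictionary stub in the usage example is purely a stand-in for a trained seq2seq analyzer.

```python
def backtranslate_inflection(unlabeled_forms, reverse_model, gold_triples):
    """Augment gold (lemma, tags, form) triples with pseudo-triples obtained by
    running a reverse analysis model over unlabeled word forms."""
    pseudo = []
    for form in unlabeled_forms:
        analysis = reverse_model(form)          # expected: (lemma, tags) or None
        if analysis is not None:
            lemma, tags = analysis
            pseudo.append((lemma, tags, form))
    return gold_triples + pseudo

# Toy stand-in for a trained reverse model (a lookup here, a seq2seq model in practice).
toy_reverse = {"walked": ("walk", "V;PST"), "cats": ("cat", "N;PL")}.get
gold = [("run", "V;PST", "ran")]
augmented = backtranslate_inflection(["walked", "cats", "xyzzy"], toy_reverse, gold)
print(augmented)   # gold triple plus two pseudo-labeled triples; "xyzzy" is skipped
```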
Two Heads are Better than One? Verification of Ensemble Effect in Neural Machine Translation
DOI: 10.18653/v1/2021.insights-1.4
Chanjun Park, Sungjin Park, Seolhwa Lee, Taesun Whang, Heuiseok Lim
Abstract: In the field of natural language processing, ensembles are widely known to be effective in improving performance. This paper analyzes how ensembles of neural machine translation (NMT) models affect performance by designing various experimental setups (i.e., intra-ensemble, inter-ensemble, and non-convergence ensemble). For an in-depth examination, we analyze each ensemble method with respect to several aspects such as different attention models and vocabulary strategies. Experimental results show that ensembling does not always lead to performance increases, and we report noteworthy negative findings.
Citations: 1
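Ensembling in NMT decoding is typically done by combining the member models' per-step token distributions. A minimal greedy-decoding sketch that averages probabilities across models; the member models here are arbitrary callables returning a vocabulary distribution, and arithmetic averaging is one common choice, not necessarily the setup used in the paper.

```python
import numpy as np

def ensemble_greedy_decode(models, src, bos=1, eos=2, max_len=20):
    """Greedy decoding where each step's distribution is the arithmetic mean of
    the member models' next-token probabilities."""
    tokens = [bos]
    for _ in range(max_len):
        probs = np.mean([m(src, tokens) for m in models], axis=0)  # (vocab,)
        nxt = int(probs.argmax())
        tokens.append(nxt)
        if nxt == eos:
            break
    return tokens

# Two toy "models" returning random next-token distributions over a 5-word vocab.
def make_toy_model(seed):
    r = np.random.default_rng(seed)
    return lambda src, prefix: r.dirichlet(np.ones(5))

print(ensemble_greedy_decode([make_toy_model(1), make_toy_model(2)], src="ein test"))
```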