First Workshop on Insights from Negative Results in NLP: Latest Publications

What GPT Knows About Who is Who
First Workshop on Insights from Negative Results in NLP Pub Date: 2022-05-16 DOI: 10.48550/arXiv.2205.07407
Xiaohan Yang, Eduardo Peynetti, Vasco Meerman, Christy Tanner
Abstract: Coreference resolution, a crucial task for understanding discourse and language at large, has yet to witness widespread benefits from large language models (LLMs). Moreover, coreference resolution systems largely rely on supervised labels, which are expensive and difficult to annotate, making the task ripe for prompt engineering. In this paper, we introduce a QA-based prompt-engineering method and discern the abilities and limitations of generative, pre-trained LLMs on the task of coreference resolution. Our experiments show that GPT-2 and GPT-Neo can return valid answers, but that their capabilities to identify coreferent mentions are limited and prompt-sensitive, leading to inconsistent results.
Citations: 6
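The QA framing the paper describes can be sketched as a prompt template; the template wording and the example passage below are illustrative assumptions, not the authors' exact prompt:

```python
# Sketch of a QA-style coreference prompt for a generative LM
# (illustrative template, not the authors' exact wording).
def build_coref_prompt(passage: str, mention: str) -> str:
    """Frame coreference resolution as a question-answering prompt."""
    return (
        f"Passage: {passage}\n"
        f"Question: In the passage above, who does \"{mention}\" refer to?\n"
        f"Answer:"
    )

passage = "Mary met the new intern. She showed him the lab."
prompt = build_coref_prompt(passage, "She")
print(prompt)
# A model such as GPT-2 or GPT-Neo would then complete the prompt; the paper
# finds such completions are often valid answers but highly prompt-sensitive.
```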
Pathologies of Pre-trained Language Models in Few-shot Fine-tuning
First Workshop on Insights from Negative Results in NLP Pub Date: 2022-04-17 DOI: 10.48550/arXiv.2204.08039
Hanjie Chen, Guoqing Zheng, A. Awadallah, Yangfeng Ji
Abstract: Although adapting pre-trained language models with few examples has shown promising performance on text classification, there is a lack of understanding of where the performance gain comes from. In this work, we propose to answer this question by interpreting the adaptation behavior using post-hoc explanations of model predictions. By modeling feature statistics of explanations, we discover that (1) without fine-tuning, pre-trained models (e.g. BERT and RoBERTa) show strong prediction bias across labels; (2) although few-shot fine-tuning can mitigate the prediction bias and demonstrate promising prediction performance, our analysis shows that models gain this improvement by capturing non-task-related features (e.g. stop words) or shallow data patterns (e.g. lexical overlaps). These observations caution that pursuing model performance with fewer examples may incur pathological prediction behavior, requiring further sanity checks on model predictions and careful design of model evaluations in few-shot fine-tuning.
Citations: 1
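One kind of feature statistic over explanations can be illustrated with a toy check of how many of the top-attributed tokens are stop words; the attribution scores and stop-word list below are fabricated for illustration:

```python
# Toy illustration of one explanation statistic: the share of stop words
# among the top-k most highly attributed tokens (scores are fabricated).
STOP_WORDS = {"the", "a", "an", "is", "of", "and", "to"}

def stopword_share(attributions: dict, k: int = 3) -> float:
    """Fraction of the k highest-attributed tokens that are stop words."""
    top = sorted(attributions, key=attributions.get, reverse=True)[:k]
    return sum(tok in STOP_WORDS for tok in top) / k

attr = {"the": 0.9, "movie": 0.2, "is": 0.7, "brilliant": 0.4}
share = stopword_share(attr, k=3)
print(share)
# A high share suggests the model leans on non-task-related features,
# the kind of pathology the paper reports after few-shot fine-tuning.
```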
Can Question Rewriting Help Conversational Question Answering?
First Workshop on Insights from Negative Results in NLP Pub Date: 2022-04-13 DOI: 10.48550/arXiv.2204.06239
Etsuko Ishii, Yan Xu, Samuel Cahyawijaya, Bryan Wilie
Abstract: Question rewriting (QR) is a subtask of conversational question answering (CQA) that aims to ease the challenge of understanding dependencies in dialogue history by reformulating questions in a self-contained form. Despite seeming plausible, little evidence is available to justify QR as a mitigation method for CQA. To verify the effectiveness of QR in CQA, we investigate a reinforcement learning approach that integrates the QR and CQA tasks and does not require corresponding QR datasets for the target CQA. We find, however, that the RL method is on par with the end-to-end baseline. We provide an analysis of the failure and describe the difficulty of exploiting QR for CQA.
Citations: 5
Extending the Scope of Out-of-Domain: Examining QA models in multiple subdomains
First Workshop on Insights from Negative Results in NLP Pub Date: 2022-04-09 DOI: 10.48550/arXiv.2204.04534
Chenyang Lyu, Jennifer Foster, Yvette Graham
Abstract: Past work investigating the out-of-domain performance of QA systems has mainly focused on general domains (e.g. the news domain, the Wikipedia domain), underestimating the importance of subdomains defined by the internal characteristics of QA datasets. In this paper, we extend the scope of "out-of-domain" by splitting QA examples into different subdomains according to their internal characteristics, including question type, text length, and answer position. We then examine the performance of QA systems trained on data from different subdomains. Experimental results show that the performance of QA systems can be significantly reduced when the training data and test data come from different subdomains. These results question the generalizability of current QA systems across subdomains, suggesting the need to combat the bias introduced by the internal characteristics of QA datasets.
Citations: 3
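Splitting examples by internal characteristics can be sketched as a bucketing function over question type, text length, and answer position; the bucket boundaries below are illustrative choices, not the paper's:

```python
# Sketch of splitting QA examples into subdomains by internal characteristics
# (question type, text length, answer position); thresholds are illustrative.
def subdomain(example: dict) -> tuple:
    q = example["question"]
    ctx = example["context"]
    ans_start = example["answer_start"]
    qtype = q.split()[0].lower() if q.split() else "other"       # e.g. "who", "what"
    length = "short" if len(ctx.split()) < 100 else "long"       # text-length bucket
    position = "early" if ans_start < len(ctx) // 2 else "late"  # answer-position bucket
    return qtype, length, position

ex = {"question": "Who wrote Hamlet?",
      "context": "Hamlet is a tragedy written by William Shakespeare.",
      "answer_start": 31}
print(subdomain(ex))
# Training on one bucket and testing on another probes the cross-subdomain
# generalization gap the paper measures.
```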
Do Data-based Curricula Work?
First Workshop on Insights from Negative Results in NLP Pub Date: 2021-12-13 DOI: 10.18653/v1/2022.insights-1.16
Maxim K. Surkov, Vladislav D. Mosin, Ivan P. Yamshchikov
Abstract: Current state-of-the-art NLP systems use large neural networks that require extensive computational resources for training. Inspired by human knowledge acquisition, researchers have proposed curriculum learning: sequencing tasks (task-based curricula) or ordering and sampling datasets (data-based curricula) to facilitate training. This work investigates the benefits of data-based curriculum learning for large language models such as BERT and T5. We experiment with various curricula based on complexity measures and different sampling strategies. Extensive experiments on several NLP tasks show that curricula based on various complexity measures rarely have any benefit, while random sampling performs as well as or better than curricula.
Citations: 3
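The two regimes being compared, ordering by a complexity measure versus random sampling, can be sketched as follows; sentence length as the complexity measure is an illustrative choice, not necessarily one the authors used:

```python
import random

# Sketch of a data-based curriculum: order training examples easy-to-hard by a
# complexity measure (here, token count; illustrative) vs. random sampling.
def curriculum_order(examples: list) -> list:
    return sorted(examples, key=lambda s: len(s.split()))

def random_order(examples: list, seed: int = 0) -> list:
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    return shuffled

data = ["a b c d e", "a b", "a b c"]
print(curriculum_order(data))  # shortest examples first
print(random_order(data))
# The paper's finding: feeding the curriculum order rarely beats the random one.
```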
Embedding Structured Dictionary Entries
First Workshop on Insights from Negative Results in NLP Pub Date: 2020-11-01 DOI: 10.18653/v1/2020.insights-1.18
Steven R. Wilson, Walid Magdy, Barbara McGillivray, Gareth Tyson
Abstract: Previous work has shown how to effectively use external resources such as dictionaries to improve English-language word embeddings, either by manipulating the training process or by applying post-hoc adjustments to the embedding space. We experiment with a multi-task learning approach for explicitly incorporating the structured elements of dictionary entries, such as user-assigned tags and usage examples, when learning embeddings for dictionary headwords. Our work generalizes several existing models for learning word embeddings from dictionaries. However, we find that the most effective representations overall are learned by simply training with a skip-gram objective over the concatenated text of all entries in the dictionary, giving no particular focus to the structure of the entries.
Citations: 2
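The winning baseline, a skip-gram objective over the concatenated text of each entry, amounts to flattening the structured fields into one token stream; the field names and the entry below are illustrative assumptions:

```python
# Flatten a structured dictionary entry (headword, definition, tags, usage
# examples; field names are illustrative) into one text stream suitable for a
# standard skip-gram trainer, discarding the structure entirely.
def flatten_entry(entry: dict) -> str:
    parts = [entry["headword"], entry["definition"]]
    parts += entry.get("tags", [])
    parts += entry.get("examples", [])
    return " ".join(parts)

entry = {"headword": "yeet",
         "definition": "to throw with force",
         "tags": ["slang"],
         "examples": ["he yeeted the ball"]}
corpus = flatten_entry(entry)
print(corpus)
# `corpus` (over all entries) would then feed any off-the-shelf skip-gram
# implementation; the paper finds this simple baseline hard to beat.
```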
How Far Can We Go with Data Selection? A Case Study on Semantic Sequence Tagging Tasks
First Workshop on Insights from Negative Results in NLP Pub Date: 2020-11-01 DOI: 10.18653/v1/2020.insights-1.3
Samuel Louvan, B. Magnini
Abstract: Although several works have addressed the role of data selection in improving transfer learning for various NLP tasks, there is no consensus about its real benefits and, more generally, there is a lack of shared practice on how it can best be applied. We propose a systematic approach aimed at evaluating data selection in scenarios of increasing complexity. Specifically, we compare the case in which source and target tasks are the same while source and target domains differ, against the more challenging scenario where both tasks and domains differ. We run a number of experiments on semantic sequence tagging tasks, which are relatively under-investigated in data selection, and conclude that data selection has more benefit in the scenario where the tasks are the same, while for different (although related) tasks from distant domains, a combination of data selection and multi-task learning is ineffective in most cases.
Citations: 0
NMF Ensembles? Not for Text Summarization!
First Workshop on Insights from Negative Results in NLP Pub Date: 2020-11-01 DOI: 10.18653/v1/2020.insights-1.14
Alka Khurana, Vasudha Bhatnagar
Abstract: Non-negative Matrix Factorization (NMF) has been used for text analytics with promising results. Instability of results arising from stochastic variations during initialization makes a case for the use of ensemble technology. However, our extensive empirical investigation indicates otherwise. In this paper, we establish that the ensemble summary for a single document using NMF is no better than the best base model summary.
Citations: 1
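The comparison at stake can be sketched with a toy ensemble: sentence salience scores from several stochastic NMF initializations (the numbers below are fabricated) are averaged and compared against a single base run:

```python
# Toy sketch of ensembling extractive-summary scores across stochastic NMF
# runs: average the per-sentence salience scores (numbers are fabricated).
def top_sentence(scores: list) -> int:
    """Index of the highest-scoring sentence (the one a summary would pick)."""
    return max(range(len(scores)), key=scores.__getitem__)

runs = [[0.9, 0.1, 0.5],   # sentence scores from one NMF initialization
        [0.8, 0.2, 0.6],   # ... another initialization
        [0.7, 0.4, 0.3]]
ensemble = [sum(col) / len(col) for col in zip(*runs)]
print(top_sentence(ensemble), top_sentence(runs[0]))
# The paper's negative result: such ensemble summaries do not beat the best
# base model's summary, despite the initialization instability.
```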
Which Matters Most? Comparing the Impact of Concept and Document Relationships in Topic Models
First Workshop on Insights from Negative Results in NLP Pub Date: 2020-11-01 DOI: 10.18653/v1/2020.insights-1.5
Silvia Terragni, Debora Nozza, E. Fersini, M. Enza
Abstract: Topic models have been widely used to discover hidden topics in a collection of documents. In this paper, we propose to investigate the role of two different types of relational information, i.e. document relationships and concept relationships. While exploiting the document network significantly improves topic coherence, the introduction of concepts and their relationships does not influence the results either quantitatively or qualitatively.
Citations: 7
If You Build Your Own NER Scorer, Non-replicable Results Will Come
First Workshop on Insights from Negative Results in NLP Pub Date: 2020-11-01 DOI: 10.18653/v1/2020.insights-1.15
Constantine Lignos, Marjan Kamyab
Abstract: We attempt to replicate a named entity recognition (NER) model implemented in a popular toolkit and discover that a critical barrier to doing so is the inconsistent evaluation of improper label sequences. We define these sequences and examine how two scorers differ in their handling of them, finding that one approach produces F1 scores approximately 0.5 points higher on the CoNLL 2003 English development and test sets. We propose best practices to increase the replicability of NER evaluations by increasing transparency regarding the handling of improper label sequences.
Citations: 7
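The scorer divergence can be made concrete with an improper BIO sequence, an I- tag with no preceding B-, and two simplified, single-entity-type conventions for handling it (a sketch of the general phenomenon, not the two specific scorers the paper compares):

```python
# An improper BIO sequence: "I-PER" appears with no preceding "B-PER".
# Scorers differ on whether this span counts as an entity; a simplified,
# single-type sketch of two common conventions follows.
labels = ["O", "I-PER", "I-PER", "O"]

def spans_lenient(tags: list) -> list:
    """Convention A: a stray I- implicitly begins a new entity."""
    spans, start = [], None
    for i, t in enumerate(tags):
        if t.startswith(("B-", "I-")) and start is None:
            start = i
        elif t == "O" and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans

def spans_strict(tags: list) -> list:
    """Convention B: only B- may open an entity; stray I- tokens are ignored."""
    spans, start = [], None
    for i, t in enumerate(tags):
        if t.startswith("B-"):
            if start is not None:
                spans.append((start, i))
            start = i
        elif t.startswith("I-") and start is not None:
            continue  # extend the currently open entity
        else:
            if start is not None:
                spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(tags)))
    return spans

print(spans_lenient(labels), spans_strict(labels))
# One convention credits a (1, 3) span, the other credits nothing —
# exactly the kind of divergence that shifts F1 between scorers.
```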