Analyzing Semantic Faithfulness of Language Models via Input Intervention on Question Answering

Impact Factor: 9.3 · CAS Zone 2 · Computer Science
Akshay Chaturvedi, Soumadeep Saha, Nicholas Asher, Swarnadeep Bhar, Utpal Garain
{"title":"Analyzing Semantic Faithfulness of Language Models via Input Intervention on Question Answering","authors":"Akshay Chaturvedi, Soumadeep Saha, Nicholas Asher, Swarnadeep Bhar, Utpal Garain","doi":"10.1162/coli_a_00493","DOIUrl":null,"url":null,"abstract":"Transformer-based language models have been shown to be highly effective for several NLP tasks. In this paper, we consider three transformer models, BERT, RoBERTa, and XLNet, in both small and large versions, and investigate how faithful their representations are with respect to the semantic content of texts. We formalize a notion of semantic faithfulness, in which the semantic content of a text should causally figure in a model's inferences in question answering. We then test this notion by observing a model's behavior on answering questions about a story after performing two novel semantic interventions—deletion intervention and negation intervention. While transformer models achieve high performance on standard question answering tasks, we show that they fail to be semantically faithful once we perform these interventions for a significant number of cases (∼ 50% for deletion intervention, and ∼ 20% drop in accuracy for negation intervention). We then propose an intervention-based training regime that can mitigate the undesirable effects for deletion intervention by a significant margin (from ∼ 50% to ∼ 6%). We analyze the inner-workings of the models to better understand the effectiveness of intervention-based training for deletion intervention. But we show that this training does not attenuate other aspects of semantic unfaithfulness such as the models' inability to deal with negation intervention or to capture the predicate-argument structure of texts. We also test InstructGPT, via prompting, for its ability to handle the two interventions and to capture predicate-argument structure. While InstructGPT models do achieve very high performance on predicate-argument structure task, they fail to respond adequately to our deletion and negation interventions.","PeriodicalId":49089,"journal":{"name":"Computational Linguistics","volume":"6 1","pages":""},"PeriodicalIF":9.3000,"publicationDate":"2023-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computational Linguistics","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1162/coli_a_00493","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract

Transformer-based language models have been shown to be highly effective for several NLP tasks. In this paper, we consider three transformer models, BERT, RoBERTa, and XLNet, in both small and large versions, and investigate how faithful their representations are with respect to the semantic content of texts. We formalize a notion of semantic faithfulness, in which the semantic content of a text should causally figure in a model's inferences in question answering. We then test this notion by observing a model's behavior on answering questions about a story after performing two novel semantic interventions: deletion intervention and negation intervention. While transformer models achieve high performance on standard question answering tasks, we show that they fail to be semantically faithful once we perform these interventions, in a significant number of cases (∼50% of cases for deletion intervention, and a ∼20% drop in accuracy for negation intervention). We then propose an intervention-based training regime that mitigates the undesirable effects of deletion intervention by a significant margin (from ∼50% to ∼6%). We analyze the inner workings of the models to better understand the effectiveness of intervention-based training for deletion intervention. But we show that this training does not attenuate other aspects of semantic unfaithfulness, such as the models' inability to deal with negation intervention or to capture the predicate-argument structure of texts. We also test InstructGPT, via prompting, for its ability to handle the two interventions and to capture predicate-argument structure. While InstructGPT models do achieve very high performance on the predicate-argument structure task, they fail to respond adequately to our deletion and negation interventions.
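To make the two interventions concrete, here is a minimal sketch of how a deletion and a negation intervention can be run against an off-the-shelf extractive QA model. It assumes the HuggingFace transformers pipeline and the deepset/roberta-base-squad2 checkpoint as a stand-in for the paper's fine-tuned models; the story, question, and edits are invented for illustration and are not drawn from the paper's data.

# Minimal sketch of the deletion and negation interventions (illustrative;
# the story and question are invented, not taken from the paper's dataset).
from transformers import pipeline

qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

story = (
    "Maria walked to the station. "
    "She bought a ticket to Lisbon. "
    "The train left at noon."
)
question = "Where did Maria buy a ticket to?"

# Deletion intervention: remove the sentence the answer causally depends on.
deleted = story.replace("She bought a ticket to Lisbon. ", "")
# Negation intervention: negate the supporting predicate instead.
negated = story.replace("bought a ticket", "did not buy a ticket")

for label, context in [("original", story), ("deletion", deleted), ("negation", negated)]:
    out = qa(question=question, context=context)
    print(f"{label:>8}: answer={out['answer']!r} score={out['score']:.3f}")

# A semantically faithful model should stop answering "Lisbon" (or at least
# assign it much lower confidence) under both interventions; the abstract
# reports that models keep their original answer in ~50% of deletion cases
# and lose ~20% accuracy under negation.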
Source Journal
Computational Linguistics (Computer Science - Artificial Intelligence)
Self-citation rate: 0.00%
Annual publications: 45
Journal description: Computational Linguistics is the longest-running publication devoted exclusively to the computational and mathematical properties of language and the design and analysis of natural language processing systems. This highly regarded quarterly offers university and industry linguists, computational linguists, artificial intelligence and machine learning investigators, cognitive scientists, speech specialists, and philosophers the latest information about the computational aspects of all facets of research on language.