Neural models for Factual Inconsistency Classification with Explanations

Tathagata Raha, Mukund Choudhary, Abhinav Menon, Harshit Gupta, KV Aditya Srivatsa, Manish Gupta, Vasudeva Varma
{"title":"Neural models for Factual Inconsistency Classification with Explanations","authors":"Tathagata Raha, Mukund Choudhary, Abhinav Menon, Harshit Gupta, KV Aditya Srivatsa, Manish Gupta, Vasudeva Varma","doi":"10.48550/arXiv.2306.08872","DOIUrl":null,"url":null,"abstract":"Factual consistency is one of the most important requirements when editing high quality documents. It is extremely important for automatic text generation systems like summarization, question answering, dialog modeling, and language modeling. Still, automated factual inconsistency detection is rather under-studied. Existing work has focused on (a) finding fake news keeping a knowledge base in context, or (b) detecting broad contradiction (as part of natural language inference literature). However, there has been no work on detecting and explaining types of factual inconsistencies in text, without any knowledge base in context. In this paper, we leverage existing work in linguistics to formally define five types of factual inconsistencies. Based on this categorization, we contribute a novel dataset, FICLE (Factual Inconsistency CLassification with Explanation), with ~8K samples where each sample consists of two sentences (claim and context) annotated with type and span of inconsistency. When the inconsistency relates to an entity type, it is labeled as well at two levels (coarse and fine-grained). Further, we leverage this dataset to train a pipeline of four neural models to predict inconsistency type with explanations, given a (claim, context) sentence pair. Explanations include inconsistent claim fact triple, inconsistent context span, inconsistent claim component, coarse and fine-grained inconsistent entity types. The proposed system first predicts inconsistent spans from claim and context; and then uses them to predict inconsistency types and inconsistent entity types (when inconsistency is due to entities). We experiment with multiple Transformer-based natural language classification as well as generative models, and find that DeBERTa performs the best. Our proposed methods provide a weighted F1 of ~87% for inconsistency type classification across the five classes.","PeriodicalId":74091,"journal":{"name":"Machine learning and knowledge discovery in databases : European Conference, ECML PKDD ... : proceedings. ECML PKDD (Conference)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2023-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine learning and knowledge discovery in databases : European Conference, ECML PKDD ... : proceedings. ECML PKDD (Conference)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2306.08872","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

Factual consistency is one of the most important requirements when editing high-quality documents, and it is critical for automatic text generation systems like summarization, question answering, dialog modeling, and language modeling. Still, automated factual inconsistency detection is rather under-studied. Existing work has focused on (a) detecting fake news given a knowledge base as context, or (b) detecting broad contradictions (as part of the natural language inference literature). However, there has been no work on detecting and explaining types of factual inconsistencies in text without any knowledge base in context. In this paper, we leverage existing work in linguistics to formally define five types of factual inconsistencies. Based on this categorization, we contribute a novel dataset, FICLE (Factual Inconsistency CLassification with Explanation), with ~8K samples, where each sample consists of two sentences (claim and context) annotated with the type and span of the inconsistency. When the inconsistency relates to an entity type, it is additionally labeled at two levels (coarse and fine-grained). Further, we leverage this dataset to train a pipeline of four neural models that, given a (claim, context) sentence pair, predicts the inconsistency type along with explanations. Explanations include the inconsistent claim fact triple, the inconsistent context span, the inconsistent claim component, and the coarse- and fine-grained inconsistent entity types. The proposed system first predicts the inconsistent spans in the claim and context, and then uses them to predict the inconsistency type and the inconsistent entity types (when the inconsistency is due to entities). We experiment with multiple Transformer-based natural language classification models as well as generative models, and find that DeBERTa performs best. Our proposed method achieves a weighted F1 of ~87% for inconsistency type classification across the five classes.
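To make the final stage of such a pipeline concrete, below is a minimal sketch of a DeBERTa-based (claim, context) pair classifier using the HuggingFace transformers library. This is not the authors' released code: the label names and the model checkpoint are illustrative assumptions, and the classification head here is freshly initialized, so it would need fine-tuning on FICLE before its predictions are meaningful.

```python
# Sketch of an inconsistency-type classifier for (claim, context) pairs.
# Assumptions (not from the paper's released artifacts): the label names
# below and the use of the microsoft/deberta-v3-base checkpoint.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Hypothetical names for the paper's five inconsistency types.
INCONSISTENCY_TYPES = ["taxonomic", "negation", "set-based", "gradable", "simple"]

tokenizer = AutoTokenizer.from_pretrained("microsoft/deberta-v3-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/deberta-v3-base", num_labels=len(INCONSISTENCY_TYPES)
)  # classification head is randomly initialized; fine-tune on FICLE first
model.eval()

def classify_inconsistency(claim: str, context: str) -> str:
    # Encode the pair as one sequence: [CLS] claim [SEP] context [SEP]
    inputs = tokenizer(claim, context, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, num_labels)
    return INCONSISTENCY_TYPES[logits.argmax(dim=-1).item()]

print(classify_inconsistency(
    "The Eiffel Tower is located in Berlin.",
    "The Eiffel Tower is a wrought-iron lattice tower in Paris, France.",
))
```

In the full system described in the abstract, this classifier would sit after the span-prediction models, which supply the inconsistent claim and context spans that the type and entity-type predictors condition on.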