Maximizing discrimination masking for faithful question answering with machine reading

IF 7.4, Region 1 (Management), JCR Q1: COMPUTER SCIENCE, INFORMATION SYSTEMS

Information Processing & Management, Pub Date: 2024-10-09, DOI: 10.1016/j.ipm.2024.103915

Dong Li, Jintao Tang, Pancheng Wang, Shasha Li, Ting Wang

Volume 62 (1), Article 103915. Publication type: Journal Article. Full text: https://www.sciencedirect.com/science/article/pii/S0306457324002747

Citations: 0

Abstract



Despite recent advances in Question Answering with Machine Reading (QAMR), such as Large Language Models (LLMs), improving the factuality and faithfulness of QAMR models remains a significant challenge. QAMR models require both language knowledge and world knowledge to answer questions. Language knowledge encompasses syntax, semantics, pragmatics, and other language-specific elements; its extent reflects the model's language understanding capabilities. World knowledge, which refers to people's cognition of the world, may be parameterized knowledge stored in pre-trained language models or textual knowledge contained in passages. We conduct a comparative study of these two kinds of knowledge and find that language knowledge is stable, while only part of world knowledge is stable and reliable. This motivates us to use the textual knowledge of passages and to avoid the unstable parameterized world knowledge of pre-trained language models in the QAMR task. To this end, this paper introduces the concept of Answerable without relying on unstable world knowledge external to the passage (AUKE) to determine whether a question can be answered without using the unstable parameterized world knowledge of pre-trained language models. We then define evidence as the simplest substring in the passage that supports AUKE. Based on evidence, we introduce a novel faithfulness metric for the QAMR task. We propose a methodology that combines automated processing with manual refinement to augment QAMR datasets with evidence annotations, facilitating faithfulness evaluation. We apply this method to the Chinese QAMR datasets CMRC 2018 and DRCD, producing two extended datasets that support evidence-based faithfulness evaluation: CMRCFF (CMRC with Faithfulness) and DRCDFF (DRCD with Faithfulness).
To alleviate the potential factuality and faithfulness issues induced by unstable world knowledge, we propose a method called Maximizing Discrimination Masking (MDM), which masks the word with the highest degree of distinguishability. MDM is an approximation method designed to circumvent the reliance of QAMR systems on the unstable parameterized world knowledge embedded in their pre-trained language models. We conduct experiments in fine-tuning and few-shot settings on CMRCFF and DRCDFF. The results verify that MDM effectively improves the factuality and faithfulness of the models.
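
The abstract states only that MDM "masks the word with the highest degree of distinguishability"; it does not say how that degree is computed. The following is a minimal sketch of the selection step alone, not the authors' implementation. The function names, the `[MASK]` placeholder, and in particular `toy_score` (which simply treats longer tokens as more distinctive) are hypothetical stand-ins chosen so the example runs.

```python
from typing import Callable, List, Sequence

MASK = "[MASK]"  # assumed placeholder token, not taken from the paper


def mask_most_discriminative(
    tokens: Sequence[str],
    score: Callable[[str], float],
    mask_token: str = MASK,
) -> List[str]:
    """Return a copy of tokens with the highest-scoring token masked.

    On ties, the earliest highest-scoring token is masked.
    """
    if not tokens:
        return []
    scores = [score(t) for t in tokens]
    # Index of the maximum score; max() keeps the first index on ties.
    target = max(range(len(tokens)), key=scores.__getitem__)
    return [mask_token if i == target else t for i, t in enumerate(tokens)]


def toy_score(token: str) -> float:
    # Hypothetical stand-in for a distinguishability score:
    # longer tokens are treated as more distinctive.
    return float(len(token))


tokens = ["the", "capital", "of", "France", "is", "Paris"]
masked = mask_most_discriminative(tokens, toy_score)
print(masked)
```

Passing the score as a callable keeps the selection logic separate from whatever definition of distinguishability the full paper uses, so a real scoring function could replace `toy_score` without changing the masking step.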


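Source journal

Information Processing & Management (Engineering Technology, Computer Science: Information Systems)

CiteScore: 17.00
Self-citation rate: 11.60%
Articles published: 276
Review time: 39 days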

About the journal: Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing. We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.