Dong Li, Jintao Tang, Pancheng Wang, Shasha Li, Ting Wang
Information Processing & Management (JCR Q1, Computer Science, Information Systems; impact factor 7.4), published 2024-10-09
DOI: 10.1016/j.ipm.2024.103915
https://www.sciencedirect.com/science/article/pii/S0306457324002747
Maximizing discrimination masking for faithful question answering with machine reading
Despite recent advances in Question Answering with Machine Reading (QAMR), including Large Language Models (LLMs), improving the factuality and faithfulness of QAMR models remains a significant challenge. QAMR models require both language knowledge and world knowledge to answer questions. Language knowledge encompasses syntax, semantics, pragmatics, and other language-specific elements; its extent reflects a model's language understanding capability. World knowledge, which refers to people's cognition of the world, may be parameterized knowledge stored in pre-trained language models or textual knowledge contained in passages. We conduct a comparative study of these two kinds of knowledge and find that language knowledge is stable, while only part of world knowledge is stable and reliable. This motivates us to rely on the textual knowledge of passages and to avoid the unstable parameterized world knowledge of pre-trained language models in the QAMR task. To this end, this paper introduces the concept of Answerable without relying on unstable world knowledge external to the passage (AUKE), which determines whether a question can be answered without using the unstable parameterized world knowledge of pre-trained language models. We then define evidence as the simplest substring in the passage that supports AUKE. Based on evidence, we introduce a novel faithfulness metric for the QAMR task. We propose a methodology that combines automated processing with manual refinement to augment QAMR datasets with evidence annotations, facilitating faithfulness evaluation. We apply this method to the Chinese QAMR datasets CMRC 2018 and DRCD, producing two extended datasets that support evidence-based faithfulness evaluation: CMRCFF (CMRC with Faithfulness) and DRCDFF (DRCD with Faithfulness).
To alleviate the factuality and faithfulness issues that unstable world knowledge can induce, we propose Maximizing Discrimination Masking (MDM), a method that masks the word with the highest degree of distinguishability. MDM is an approximation method designed to circumvent reliance on the unstable parameterized world knowledge embedded in the pre-trained language models used by QAMR systems. We conduct experiments in the fine-tuning and few-shot settings on CMRCFF and DRCDFF. The results show that our MDM approach effectively improves the factuality and faithfulness of the models.
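The masking step described above, replacing the single word with the highest degree of distinguishability, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how distinguishability is computed or which text (question or passage) is masked, so the scoring function is left as a caller-supplied parameter, and the names `mask_most_discriminative`, `score`, and `MASK_TOKEN` are hypothetical.

```python
from typing import Callable, List, Tuple

# Assumed placeholder symbol; the mask token used in the paper is not stated here.
MASK_TOKEN = "[MASK]"


def mask_most_discriminative(
    tokens: List[str],
    score: Callable[[str], float],
    mask_token: str = MASK_TOKEN,
) -> Tuple[List[str], int]:
    """Return a copy of `tokens` with the highest-scoring token masked.

    `score` is a hypothetical stand-in for whatever distinguishability
    measure is used; this sketch only performs the argmax-and-mask step.
    Ties are resolved in favor of the earliest token.
    """
    if not tokens:
        raise ValueError("tokens must be non-empty")
    # Index of the token with the largest score (first one wins on ties).
    best_index = max(range(len(tokens)), key=lambda i: score(tokens[i]))
    masked = list(tokens)  # copy so the caller's list is left unchanged
    masked[best_index] = mask_token
    return masked, best_index


# Toy usage with made-up scores, purely for illustration.
toy_scores = {"Who": 0.1, "founded": 0.4, "Acme": 0.9, "?": 0.0}
question = ["Who", "founded", "Acme", "?"]
masked_question, masked_index = mask_most_discriminative(
    question, lambda token: toy_scores.get(token, 0.0)
)
```

Keeping the scoring function as a parameter separates the part the abstract states (mask the most distinguishable word) from the part it does not (how distinguishability is measured).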
About the journal:
Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing.
We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.