Multi-hop commonsense knowledge injection framework for zero-shot commonsense question answering

IF 7.5 · CAS Tier 1 (Computer Science) · JCR Q1, Computer Science, Artificial Intelligence
Xin Guan, Jiuxin Cao, Biwei Cao, Qingqing Gao, Bo Liu
{"title":"Multi-hop commonsense knowledge injection framework for zero-shot commonsense question answering","authors":"Xin Guan ,&nbsp;Jiuxin Cao ,&nbsp;Biwei Cao ,&nbsp;Qingqing Gao ,&nbsp;Bo Liu","doi":"10.1016/j.eswa.2025.129806","DOIUrl":null,"url":null,"abstract":"<div><div>Zero-shot commonsense question answering (QA) task is to evaluate the general reasoning ability of the language model without training on the specific datasets. The existing zero-shot framework transforms triples within the commonsense knowledge graphs (KGs) into QA-format samples, serving as a pre-training data source to integrate commonsense knowledge into the language model. However, this approach still faces the following challenges: 1) The model trained from synthetic QA generated from triples lacks the multi-hop commonsense knowledge required for handling complex QA problems. 2) Ambiguity caused by confusing commonsense knowledge within synthetic QA, making it challenging for models to discern semantically similar entities. To address the above problem, we propose a novel <strong>M</strong>ulti-hop <strong>C</strong>ommonsense <strong>K</strong>nowledge <strong>I</strong>njection Framework (MCKI). Specifically, we draw inspiration from human complex reasoning thinking and further propose a synthetic multi-hop commonsense QA generation method. Meanwhile, we introduce negative samples with high confusion in synthetic QA, and then use contrastive learning to improve the model’s ability to distinguish similar commonsense knowledge. Extensive experiments on five commonsense question answering benchmarks demonstrate that our framework achieves state-of-the-art performance, surpassing existing methods, including large language models like GPT3.5 and ChatGPT.</div></div>","PeriodicalId":50461,"journal":{"name":"Expert Systems with Applications","volume":"298 ","pages":"Article 129806"},"PeriodicalIF":7.5000,"publicationDate":"2025-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Expert Systems with Applications","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0957417425034219","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

The zero-shot commonsense question answering (QA) task evaluates the general reasoning ability of a language model without training on task-specific datasets. Existing zero-shot frameworks transform triples from commonsense knowledge graphs (KGs) into QA-format samples, which serve as a pre-training data source for injecting commonsense knowledge into the language model. However, this approach still faces the following challenges: 1) a model trained on synthetic QA generated from single triples lacks the multi-hop commonsense knowledge required for complex questions, and 2) confusing commonsense knowledge within the synthetic QA introduces ambiguity, making it difficult for models to discriminate between semantically similar entities. To address these problems, we propose a novel Multi-hop Commonsense Knowledge Injection framework (MCKI). Specifically, drawing inspiration from human complex reasoning, we propose a synthetic multi-hop commonsense QA generation method. Meanwhile, we introduce highly confusing negative samples into the synthetic QA and use contrastive learning to improve the model's ability to distinguish similar commonsense knowledge. Extensive experiments on five commonsense question answering benchmarks demonstrate that our framework achieves state-of-the-art performance, surpassing existing methods, including large language models such as GPT-3.5 and ChatGPT.
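
The abstract only describes the two ingredients (multi-hop QA synthesis from KG triples and contrastive learning over confusing negatives) at a high level; the snippet below is a minimal, hypothetical Python sketch of both ideas, not the authors' implementation. All identifiers (`synthesize_multihop_qa`, `info_nce_over_candidates`, `RELATION_TEMPLATES`) and the toy triples are assumptions introduced for illustration.

```python
# Illustrative sketch only: the paper does not publish this code. It shows one
# plausible way to (a) chain two KG triples that share an entity into a 2-hop
# QA sample with confusing distractors, and (b) train against an InfoNCE-style
# contrastive loss over the candidate scores, as the abstract describes.
import torch
import torch.nn.functional as F

# Toy ConceptNet-style triples: (head, relation, tail).
TRIPLES = [
    ("go to restaurant", "Causes", "order food"),
    ("order food", "Causes", "wait for meal"),
    ("order food", "HasSubevent", "read menu"),
]

RELATION_TEMPLATES = {
    "Causes": "{h}, so {t}",
    "HasSubevent": "while {h}, one would {t}",
}

def synthesize_multihop_qa(first, second, distractors):
    """Chain two triples sharing a bridge entity into one 2-hop QA sample."""
    h1, r1, bridge = first
    h2, r2, answer = second
    assert bridge == h2, "the two triples must share the bridge entity"
    # Verbalize hop 1 as context and ask about the tail of hop 2, so the
    # correct answer requires reasoning through the intermediate entity.
    context = RELATION_TEMPLATES[r1].format(h=h1, t=bridge)
    question = f"{context}. What is likely to follow?"
    return {
        "question": question,
        "answer": answer,
        # Confusing negatives: entities close to the bridge in the KG that are
        # not the gold tail, i.e. semantically similar but wrong.
        "distractors": distractors,
    }

sample = synthesize_multihop_qa(TRIPLES[0], TRIPLES[1], ["read menu", "cook dinner"])

def info_nce_over_candidates(scores, gold_index, temperature=0.1):
    """InfoNCE-style loss pushing the gold answer's score above the distractors'.

    `scores` is a (num_candidates,) tensor of plausibility scores for the gold
    answer and its confusing negatives, produced by whatever model is trained.
    """
    logits = scores.unsqueeze(0) / temperature
    target = torch.tensor([gold_index])
    return F.cross_entropy(logits, target)

# Example with toy scores for [gold, distractor1, distractor2].
loss = info_nce_over_candidates(torch.tensor([2.1, 1.9, 0.3]), gold_index=0)
print(sample["question"], "->", sample["answer"], "| loss:", float(loss))
```

In the actual framework the candidate scores would come from the pre-trained language model being injected with knowledge; here they are toy numbers, and the verbalization templates stand in for whatever QA synthesis scheme the paper uses.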
Source journal
Expert Systems with Applications (Engineering & Technology: Electrical & Electronic Engineering)
CiteScore: 13.80
Self-citation rate: 10.60%
Articles published: 2045
Review time: 8.7 months
Journal description: Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.