BT-CKBQA: An efficient approach for Chinese knowledge base question answering
Erhe Yang, Fei Hao, Jiaxing Shang, Xiaoliang Chen, Doo-Soon Park
Data & Knowledge Engineering, volume 147, Article 102204. Published 2023-09-01. DOI: 10.1016/j.datak.2023.102204
https://www.sciencedirect.com/science/article/pii/S0169023X23000642
Abstract
Knowledge Base Question Answering (KBQA) is an increasingly essential application that provides accurate responses to user queries, ensuring that users obtain relevant information and make decisions promptly. Deep learning-based approaches have achieved satisfactory QA results by leveraging neural network models. However, these approaches require numerous parameters, which increases the workload of tuning them. To address this problem, we propose BT-CKBQA, a practical and highly efficient approach that incorporates BM25 and Template-based predicate mapping for Chinese KBQA (CKBQA). In addition, a concept lattice-based approach is proposed for summarizing the knowledge base, which can largely improve the execution efficiency of QA with little loss of performance. Concretely, BT-CKBQA leverages the BM25 algorithm and a custom dictionary to detect the subject of a question sentence. A template-based predicate generation approach then generates candidate predicates. Finally, a ranking approach that jointly considers character similarity and semantic similarity is used for predicate mapping. Extensive experiments on the NLPCC-ICCPOL 2016 and 2018 KBQA datasets demonstrate the superiority of the proposed approach over the compared baselines. In particular, the average F1-score of BT-CKBQA for mention detection reaches up to 98.25%, which outperforms the best method currently available in the literature. For question answering, the proposed approach outperforms most baselines, with an F1-score of 82.68%. Compared to state-of-the-art baselines, the execution efficiency and the QA performance per unit time improve by up to 56.39% and 44.06%, respectively. Experiments on question diversity indicate that the proposed approach performs better on diversified questions than on domain-specific questions. A case study on a constructed COVID-19 knowledge base illustrates the effectiveness and practicability of BT-CKBQA.
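The abstract says that BM25 and a custom dictionary are used to detect the subject of a question, but it does not give the details. The following is a minimal sketch of the general idea only, not the authors' implementation: it scores each dictionary entry against the question with the standard Okapi BM25 formula. The character-level tokenization, the parameter defaults `k1=1.5` and `b=0.75`, and the function names `bm25_scores` and `detect_subject` are all assumptions made for illustration.

```python
import math
from collections import Counter


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score every tokenized document against the query with Okapi BM25."""
    n_docs = len(docs_tokens)
    avg_len = sum(len(d) for d in docs_tokens) / n_docs

    # Document frequency: in how many documents each token appears.
    df = Counter()
    for d in docs_tokens:
        df.update(set(d))

    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        score = 0.0
        # Each distinct query token contributes once (a simplification).
        for term in set(query_tokens):
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def detect_subject(question, entity_names):
    """Return the best-scoring dictionary entry and all scores.

    Assumption: questions and entity names are split into single
    characters, a common fallback for Chinese text without a segmenter.
    """
    docs = [list(name) for name in entity_names]
    scores = bm25_scores(list(question), docs)
    best = max(range(len(entity_names)), key=scores.__getitem__)
    return entity_names[best], scores


# Hypothetical three-entry dictionary and question, for illustration only.
subject, scores = detect_subject("李白的出生地是哪里", ["李白", "杜甫", "长城"])
```

Here only the first entry shares characters with the question, so it is selected as the subject. A real system would also need to handle ties and questions that match no entry.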
About the journal:
Data & Knowledge Engineering (DKE) stimulates the exchange of ideas and interaction between these two related fields of interest. DKE reaches a world-wide audience of researchers, designers, managers and users. The major aim of the journal is to identify, investigate and analyze the underlying principles in the design and effective use of these systems.