Journal of Web Semantics: Latest Articles

Improved distant supervision relation extraction based on edge-reasoning hybrid graph model
IF 2.5, Zone 3, Computer Science
Journal of Web Semantics Pub Date : 2021-07-01 DOI: 10.1016/j.websem.2021.100656
Shirong Shen, Shangfu Duan, Huan Gao, Guilin Qi
Abstract: Distant supervision relation extraction (DSRE) trains a classifier by automatically labeling data through aligning triples in the knowledge base (KB) with large-scale corpora. Training data generated by distant supervision may contain many mislabeled instances, which is harmful to the training of the classifier. Some recent methods show that relevant background information in KBs, such as entity type (e.g., Organization and Book), can improve the performance of DSRE. However, there are three main problems with these methods. Firstly, these methods are tailored for a specific type of information. A specific type of information only has a positive effect on a subset of instances and is not beneficial in all cases. Secondly, different background information is embedded independently, and no reasonable interaction is achieved. Thirdly, previous methods do not consider the side effect of the noise introduced by background information. To address these issues, we leverage five types of background information instead of the single type used in previous works and propose a novel edge-reasoning hybrid graph (ER-HG) model to realize reasonable interaction between different kinds of information. In addition, we employ an attention mechanism in the ER-HG model to alleviate the side effect of noise. The ER-HG model integrates all types of information efficiently and is very robust to noise in that information. We conduct experiments on two widely used datasets. The experimental results demonstrate that our model significantly outperforms state-of-the-art methods on the held-out metric and in robustness tests.
Citations: 3
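The labeling step that the abstract above attributes to distant supervision can be sketched in a few lines. This is a minimal illustration of the general idea only, not the ER-HG model; the function name `distant_label`, the relation name `founder_of`, and the toy sentences are assumptions made up for the example.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]  # (head entity, relation, tail entity)

def distant_label(triples: List[Triple], sentences: List[str]) -> List[Tuple[str, str, str, str]]:
    """Label every sentence that mentions both entities of a KB triple
    with that triple's relation (naive substring matching)."""
    labeled = []
    for head, relation, tail in triples:
        for sentence in sentences:
            if head in sentence and tail in sentence:
                labeled.append((sentence, head, tail, relation))
    return labeled

# Toy knowledge base and corpus, invented for illustration only.
kb = [("Steve Jobs", "founder_of", "Apple")]
corpus = [
    "Steve Jobs co-founded Apple in 1976.",
    # Mentions both entities but does not express founder_of: a mislabeled instance.
    "Steve Jobs returned to Apple in 1997.",
    "Apple released a new phone.",
]

for row in distant_label(kb, corpus):
    print(row)
```

Both of the first two sentences receive the `founder_of` label even though only the first expresses it, which is the kind of noise the abstract refers to.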
Supporting contextualized learning with linked open data
IF 2.5, Zone 3, Computer Science
Journal of Web Semantics Pub Date : 2021-07-01 DOI: 10.1016/j.websem.2021.100657
Adolfo Ruiz-Calleja, Guillermo Vega-Gorgojo, Miguel L. Bote-Lorenzo, Juan I. Asensio-Pérez, Yannis Dimitriadis, Eduardo Gómez-Sánchez
Abstract: This paper proposes a template-based approach to semi-automatically create contextualized learning tasks out of several sources from the Web of Data. The contextualization of learning tasks opens the possibility of bridging formal learning that happens in a classroom and informal learning that happens in other physical spaces, such as squares or historical buildings. The tasks created cover different cognitive levels and are contextualized by their location and the topics covered. We applied this approach to the domain of History of Art in the Spanish region of Castile and Leon. We gathered data from DBpedia, Wikidata and the Open Data published by the regional government, and we applied 32 templates to obtain 16K learning tasks. An evaluation with 8 teachers shows that teachers would accept having their students carry out the generated tasks. Teachers also considered that 85% of the generated tasks are aligned with the content taught in the classroom and are relevant for learning in other informal spaces. The tasks created are available at https://casuallearn.gsic.uva.es/sparql.
Citations: 5
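The template-based generation mentioned in the abstract above can be pictured as filling a question template once per data record. The sketch below is only a rough illustration under stated assumptions: the template wording, the field names (`name`, `town`, `style`) and the two records are hypothetical and are not taken from the paper's 32 templates or its datasets.

```python
from string import Template

# Hypothetical task template; the paper's own templates are not reproduced here.
TASK_TEMPLATE = Template("Visit $name in $town. In which architectural style was it built?")

# Toy records standing in for rows that a linked-data query might return.
monuments = [
    {"name": "Burgos Cathedral", "town": "Burgos", "style": "Gothic"},
    {"name": "Aqueduct of Segovia", "town": "Segovia", "style": "Roman"},
]

def generate_tasks(template, records):
    """Fill the template once per record and keep the expected answer alongside."""
    return [
        {
            "question": template.substitute(name=r["name"], town=r["town"]),
            "answer": r["style"],
        }
        for r in records
    ]

tasks = generate_tasks(TASK_TEMPLATE, monuments)
for task in tasks:
    print(task["question"], "->", task["answer"])
```

One template applied to many records yields many tasks, which is how a small number of templates can produce a large task collection.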
DTN: Deep triple network for topic specific fake news detection
IF 2.5, Zone 3, Computer Science
Journal of Web Semantics Pub Date : 2021-07-01 DOI: 10.1016/j.websem.2021.100646
Jinshuo Liu, Chenyang Wang, Chenxi Li, Ningxi Li, Juan Deng, Jeff Z. Pan
Abstract: Detection of fake news has spurred widespread interest in areas such as healthcare and Internet societies, in order to prevent the propagation of misleading information for commercial and political purposes. However, efforts to study a general framework for exploiting knowledge, for judging the trustworthiness of given news based on its content, have been limited. Indeed, existing works rarely consider incorporating knowledge graphs (KGs), which could provide rich structured knowledge for better language understanding.
In this work, we propose a deep triple network (DTN) that leverages knowledge graphs to facilitate fake news detection with triple-enhanced explanations. In the DTN, background knowledge graphs, such as open knowledge graphs and graphs extracted from news bases, are applied for both low-level and high-level feature extraction to classify the input news article and provide explanations for the classification.
The performance of the proposed method is evaluated through extensive comparative experiments. The results show that DTN outperforms conventional fake news detection methods in several respects, including the provision of factual evidence supporting the detection decision.
Citations: 15
Beware of the hierarchy — An analysis of ontology evolution and the materialisation impact for biomedical ontologies
IF 2.5, Zone 3, Computer Science
Journal of Web Semantics Pub Date : 2021-07-01 DOI: 10.1016/j.websem.2021.100658
Romana Pernisch, Daniele Dell’Aglio, Abraham Bernstein
Abstract: Ontologies are becoming a key component of numerous applications and research fields. But knowledge captured within ontologies is not static. Some ontology updates potentially have a wide-ranging impact; others only affect very localised parts of the ontology and their applications. Investigating the impact of the evolution gives us insight into the editing behaviour but also signals to ontology engineers and users how the ontology evolution is affecting other applications. However, such research is in its infancy. Hence, we need to investigate the evolution itself and its impact on the simplest of applications: the materialisation.
In this work, we define impact measures that capture the effect of changes on the materialisation. In the future, the impact measures introduced in this work can be used to investigate how aware ontology editors are of the consequences of changes. By introducing five different measures, which focus either on the change in the materialisation with respect to the size or on the number of changes applied, we are able to quantify the consequences of ontology changes. To see these measures in action, we investigate the evolution and its impact on materialisation for nine open biomedical ontologies, most of which adhere to the EL++ description logic.
Our results show that these ontologies evolve at varying paces, but no statistically significant difference between the ontologies with respect to their evolution could be identified. We identify three types of ontologies based on the types of complex changes which are applied to them throughout their evolution. The impact on the materialisation is the same for the investigated ontologies, bringing us to the conclusion that the effect of changes on the materialisation can be generalised to other similar ontologies. Further, we found that the materialised concept inclusion axioms experience most of the impact induced by changes to the class inheritance of the ontology, and other changes only marginally touch the materialisation.
Citations: 6
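To make the notion of materialisation impact in the abstract above concrete, the sketch below materialises a toy class hierarchy (as the transitive closure of its subclass axioms) and counts how many materialised axioms one edit adds or removes. It is an illustration under assumptions, not one of the paper's five measures: the functions `materialise` and `materialisation_change` and the class names are invented for the example, and real EL++ reasoning covers far more than subclass transitivity.

```python
from typing import Set, Tuple

Axiom = Tuple[str, str]  # (subclass, superclass)

def materialise(axioms: Set[Axiom]) -> Set[Axiom]:
    """Return the transitive closure of the asserted subclass axioms."""
    closure = set(axioms)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure

def materialisation_change(old: Set[Axiom], new: Set[Axiom]) -> Tuple[int, int]:
    """Count materialised axioms added and removed between two ontology versions."""
    old_m, new_m = materialise(old), materialise(new)
    return len(new_m - old_m), len(old_m - new_m)

# Two toy versions: the second adds a single asserted axiom at the top of the hierarchy.
v1 = {("Aorta", "Artery"), ("Artery", "BloodVessel")}
v2 = v1 | {("BloodVessel", "Organ")}

print(materialisation_change(v1, v2))  # (3, 0)
```

One asserted change near the top of the hierarchy produces three changes in the materialisation here, which is the amplification effect the abstract attributes to edits of the class inheritance.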
Entity summarization: State of the art and future challenges
IF 2.5, Zone 3, Computer Science
Journal of Web Semantics Pub Date : 2021-05-01 DOI: 10.1016/j.websem.2021.100647
Qingxia Liu, Gong Cheng, Kalpa Gunaratna, Yuzhong Qu
Abstract: The increasing availability of semantic data has substantially enhanced Web applications. Semantic data such as RDF data is commonly represented as entity-property-value triples. The magnitude of semantic data, in particular the large number of triples describing an entity, could overload users with excessive amounts of information. This has motivated fruitful research on automated generation of summaries for entity descriptions to satisfy users’ information needs efficiently and effectively. We focus on this prominent topic of entity summarization, and our research objective is to present the first comprehensive survey of entity summarization research. Rather than separately reviewing each method, our contributions include (1) identifying and classifying technical features of existing methods to form a high-level overview, (2) identifying and classifying frameworks for combining multiple technical features adopted by existing methods, (3) collecting known benchmarks for intrinsic evaluation and efforts for extrinsic evaluation, and (4) suggesting research directions for future work. By investigating the literature, we synthesized two hierarchies of techniques. The first hierarchy categorizes generic technical features into several perspectives: frequency and centrality, informativeness, and diversity and coverage. In the second hierarchy we present domain-specific and task-specific technical features, including the use of domain knowledge, context awareness, and personalization. Our review demonstrated that existing methods are mainly unsupervised and that they combine multiple technical features using various frameworks: random surfer models, similarity-based grouping, MMR-like re-ranking, or combinatorial optimization. We also found a few deep learning based methods in recent research. Current evaluation results and our case study showed that the problem of entity summarization is still far from being solved. Based on the limitations of existing methods revealed in the review, we identified several future directions: the use of semantics, human factors, machine and deep learning, non-extractive methods, and interactive methods.
Citations: 28
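Of the frameworks listed in the abstract above, MMR-like re-ranking is the easiest to show in code. The sketch below is a generic greedy Maximal Marginal Relevance selection, not any specific surveyed method; the relevance scores, the toy similarity (two triples count as redundant when they share a property) and the parameter `lam` are assumptions chosen for illustration.

```python
from typing import Dict, List, Tuple

PropertyValue = Tuple[str, str]  # (property, value) describing one entity

def similarity(a: PropertyValue, b: PropertyValue) -> float:
    """Toy similarity: two triples are fully redundant if they share a property."""
    return 1.0 if a[0] == b[0] else 0.0

def mmr_summary(relevance: Dict[PropertyValue, float], k: int, lam: float = 0.7) -> List[PropertyValue]:
    """Greedily pick k triples, trading relevance against redundancy with earlier picks."""
    selected: List[PropertyValue] = []
    candidates = set(relevance)
    while candidates and len(selected) < k:
        def score(t: PropertyValue) -> float:
            redundancy = max((similarity(t, s) for s in selected), default=0.0)
            return lam * relevance[t] - (1 - lam) * redundancy
        best = max(sorted(candidates), key=score)  # sorted() makes tie-breaking deterministic
        selected.append(best)
        candidates.remove(best)
    return selected

# Invented relevance scores for the description of a single entity.
scores = {
    ("type", "Person"): 0.9,
    ("type", "Scientist"): 0.8,
    ("birthPlace", "Ulm"): 0.6,
    ("award", "Nobel Prize"): 0.5,
}

print(mmr_summary(scores, k=3))
```

With these numbers the second most relevant triple, `("type", "Scientist")`, is passed over because it repeats the property of the first pick, so the summary covers three different properties. This is the diversity and coverage effect the survey groups under generic technical features.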
Handling redundant processing in OBDA query execution over relational sources
IF 2.5, Zone 3, Computer Science
Journal of Web Semantics Pub Date : 2021-04-01 DOI: 10.1016/j.websem.2021.100639
Dimitris Bilidas, Manolis Koubarakis
Abstract: Redundant processing is a key problem in the translation of initial queries posed over an ontology into SQL queries, through mappings, as it is performed by ontology-based data access systems. Examples of such processing are duplicate answers obtained during query evaluation, which must finally be discarded, or common expressions evaluated multiple times from different parts of the same complex query. Many optimizations that aim to minimize this problem have been proposed and implemented, mostly based on semantic query optimization techniques, by exploiting ontological axioms and constraints defined in the database schema. However, data operations that introduce redundant processing are still generated in many practical settings, and this is a factor that impacts query execution. In this work we propose a cost-based method for query translation, which starts from an initial result and uses information about redundant processing in order to come up with an equivalent, more efficient translation. The method operates in a number of steps, relying on certain heuristics indicating that we obtain a more efficient query in each step. Through experimental evaluation using the Ontop system for ontology-based data access, we exhibit the benefits of our method.
Citations: 1
FarsBase-KBP: A knowledge base population system for the Persian Knowledge Graph
IF 2.5, Zone 3, Computer Science
Journal of Web Semantics Pub Date : 2021-04-01 DOI: 10.1016/j.websem.2021.100638
Majid Asgari-Bidhendi, Behrooz Janfada, Behrouz Minaei-Bidgoli
Abstract: While most knowledge bases already support the English language, there is only one knowledge base for the Persian language, known as FarsBase, which is automatically created from semi-structured web information. Unlike English knowledge bases such as Wikidata, which have tremendous community support, the population of a knowledge base like FarsBase must rely on automatically extracted knowledge. Knowledge base population can let FarsBase keep growing in size as the system continues working. In this paper, we present a knowledge base population system for the Persian language, which extracts knowledge from unlabelled raw text crawled from the Web. The proposed system consists of a set of state-of-the-art modules, such as an entity linking module as well as information and relation extraction modules designed for FarsBase. Moreover, a canonicalization system is introduced to link extracted relations to FarsBase properties. Then, the system uses knowledge fusion techniques with minimal intervention of human experts to integrate and filter the proper knowledge instances extracted by each module. To evaluate the performance of the presented knowledge base population system, we present the first gold dataset for benchmarking knowledge base population in the Persian language, which consists of 22015 FarsBase triples verified by human experts. The evaluation results demonstrate the efficiency of the proposed system.
Citations: 3
Knowledge graph embeddings for dealing with concept drift in machine learning
IF 2.5, Zone 3, Computer Science
Journal of Web Semantics Pub Date : 2021-02-01 DOI: 10.1016/j.websem.2020.100625
Jiaoyan Chen, Freddy Lécué, Jeff Z. Pan, Shumin Deng, Huajun Chen
Abstract: Data stream learning has been largely studied for extracting knowledge structures from continuous and rapid data records. As data evolves over time, its underlying knowledge is subject to many challenges. Concept drift, one of the core challenges in the stream learning community, is described as changes in the statistical properties of the data over time, which cause most machine learning models to become less accurate because the changes occur in unforeseen ways. This is particularly problematic as the evolution of data could lead to dramatic changes in knowledge. We address this problem by studying the semantic representation of data streams in the Semantic Web, i.e., ontology streams. Such streams are ordered sequences of data annotated with ontological vocabulary. In particular, we exploit three levels of knowledge encoded in ontology streams to deal with concept drifts: i) existence of novel knowledge gained from stream dynamics, ii) significance of knowledge change and evolution, and iii) (in)consistency of knowledge evolution. Such knowledge is encoded as knowledge graph embeddings through a combination of novel representations: entailment vectors, entailment weights, and a consistency vector. We illustrate our approach on classification tasks of supervised learning. Key contributions of the study include: (i) an effective knowledge graph embedding approach for stream ontologies, and (ii) a generic consistent prediction framework with integrated knowledge graph embeddings for dealing with concept drifts. The experiments have shown that our approach provides accurate predictions of air quality in Beijing and bus delays in Dublin with real-world ontology streams.
Citations: 9
On revealing shared conceptualization among open datasets
IF 2.5, Zone 3, Computer Science
Journal of Web Semantics Pub Date : 2021-01-01 DOI: 10.1016/j.websem.2020.100624
Miloš Bogdanović, Nataša Veljković, Milena Frtunić Gligorijević, Darko Puflović, Leonid Stoimenov
Abstract: Openness and transparency initiatives are not only milestones of scientific progress but have also influenced various fields of organization and industry. Under this influence, a variety of government institutions worldwide have published a large number of datasets through open data portals. Government data covers diverse subjects and the scale of available data is growing every year. Published data is expected to be both accessible and discoverable. For these purposes, portals take advantage of metadata accompanying datasets. However, part of the metadata is often missing, which decreases users’ ability to obtain the desired information. As the scale of published datasets grows, this problem increases. The approach we describe in this paper aims to reduce this problem by implementing knowledge structures and algorithms capable of proposing the best-matching category for an uncategorized dataset. By doing so, our aim is twofold: to enrich dataset metadata by suggesting an appropriate category, and to increase the dataset's visibility and discoverability. Our approach relies on information about open datasets provided by users, namely the dataset descriptions contained in dataset tags. Since dataset tags exhibit low consistency due to their origin, in this paper we present a method of optimizing their usage by means of semantic similarity measures based on natural language processing mechanisms. Optimization is performed in terms of reducing the number of distinct tag values used for dataset description. Once optimized, dataset tags are used to reveal the shared conceptualization originating from their usage by means of Formal Concept Analysis. We demonstrate the advantage of our proposal by comparing concept lattices generated using Formal Concept Analysis before and after the optimization process, and we use the generated structure as a knowledge base to categorize uncategorized open datasets. Finally, we present a categorization mechanism based on the generated knowledge base that takes advantage of semantic similarity measures to propose a category suitable for an uncategorized dataset.
Citations: 2
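Formal Concept Analysis, which the abstract above applies to dataset tags, groups objects by the attribute sets they share. The brute-force sketch below enumerates every formal concept of a tiny dataset-by-tag context; it is meant only to show what a concept is, not to reproduce the paper's pipeline. The dataset identifiers and tags are invented, and the subset enumeration is exponential, so it only suits toy inputs.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Set, Tuple

Concept = Tuple[FrozenSet[str], FrozenSet[str]]  # (extent: datasets, intent: tags)

def formal_concepts(context: Dict[str, Set[str]]) -> Set[Concept]:
    """Enumerate all formal concepts of an object -> attributes context."""
    objects = list(context)
    all_attrs: Set[str] = set().union(*context.values())
    concepts: Set[Concept] = set()
    for r in range(len(objects) + 1):
        for subset in combinations(objects, r):
            # Intent: attributes shared by every object in the subset.
            intent = set(all_attrs)
            for obj in subset:
                intent &= context[obj]
            # Extent: every object that carries all attributes of the intent.
            extent = frozenset(o for o in objects if intent <= context[o])
            concepts.add((extent, frozenset(intent)))
    return concepts

# Toy context: three datasets described by user-supplied tags.
tags = {
    "d1": {"transport", "bus"},
    "d2": {"transport", "rail"},
    "d3": {"health"},
}

for extent, intent in sorted(formal_concepts(tags), key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(extent), sorted(intent))
```

The concept with extent `d1, d2` and intent `transport` is the kind of shared grouping that could suggest a category for an untagged transport dataset. Merging near-synonymous tags before this step, as the abstract describes, would reduce the number of distinct attributes and therefore the size of the lattice.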
Less is more: Data-efficient complex question answering over knowledge bases
IF 2.5, Zone 3, Computer Science
Journal of Web Semantics Pub Date : 2020-12-01 DOI: 10.1016/j.websem.2020.100612
Yuncheng Hua, Yuan-Fang Li, Guilin Qi, Wei Wu, Jingyao Zhang, Daiqing Qi
Abstract: Question answering is an effective method for obtaining information from knowledge bases (KB). In this paper, we propose the Neural-Symbolic Complex Question Answering (NS-CQA) model, a data-efficient reinforcement learning framework for complex question answering that uses only a modest number of training samples. Our framework consists of a neural generator and a symbolic executor that, respectively, transform a natural-language question into a sequence of primitive actions and execute them over the knowledge base to compute the answer. We carefully formulate a set of primitive symbolic actions that allows us to not only simplify our neural network design but also accelerate model convergence. To reduce the search space, we employ the copy and masking mechanisms in our encoder–decoder architecture to drastically reduce the decoder output vocabulary and improve model generalizability. We equip our model with a memory buffer that stores high-reward promising programs. Besides, we propose an adaptive reward function. By comparing the generated trial with the trials stored in the memory buffer, we derive the curriculum-guided reward bonus, i.e., the proximity and the novelty. To mitigate the sparse reward problem, we combine the adaptive reward and the reward bonus, reshaping the sparse reward into dense feedback. Also, we encourage the model to generate new trials to avoid imitating spurious trials while making the model remember past high-reward trials to improve data efficiency. Our NS-CQA model is evaluated on two datasets: CQA, a recent large-scale complex question answering dataset, and WebQuestionsSP, a multi-hop question answering dataset. On both datasets, our model outperforms the state-of-the-art models. Notably, on CQA, NS-CQA performs well on questions with higher complexity while using only approximately 1% of the total training samples.
Citations: 21