{"title":"Uniqorn: Unified question answering over RDF knowledge graphs and natural language text","authors":"Soumajit Pramanik , Jesujoba Alabi , Rishiraj Saha Roy , Gerhard Weikum","doi":"10.1016/j.websem.2024.100833","DOIUrl":"10.1016/j.websem.2024.100833","url":null,"abstract":"<div><p>Question answering over RDF data like knowledge graphs has been greatly advanced, with a number of good systems providing crisp answers for natural language questions or telegraphic queries. Some of these systems incorporate textual sources as additional evidence for the answering process, but cannot compute answers that are present in text alone. Conversely, the IR and NLP communities have addressed QA over text, but such systems barely utilize semantic data and knowledge. This paper presents a method for <em>complex questions</em> that can seamlessly operate over a mixture of RDF datasets and text corpora, or individual sources, in a unified framework. Our method, called <span>Uniqorn</span>, builds a context graph on the fly by retrieving question-relevant evidence from the RDF data and/or a text corpus, using fine-tuned BERT models. The resulting graph typically contains all question-relevant evidence but also a lot of noise. <span>Uniqorn</span> copes with this input using a graph algorithm for Group Steiner Trees that identifies the best answer candidates in the context graph. Experimental results on several benchmarks of complex questions with multiple entities and relations show that <span>Uniqorn</span> significantly outperforms state-of-the-art methods for <em>heterogeneous QA</em> – in a full training mode, as well as in zero-shot settings. 
The graph-based methodology provides user-interpretable evidence for the complete answering process.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"83 ","pages":"Article 100833"},"PeriodicalIF":2.1,"publicationDate":"2024-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1570826824000192/pdfft?md5=1b3a7cdd704527ca28fe0609b32bbd44&pid=1-s2.0-S1570826824000192-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142238245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"KAE: A property-based method for knowledge graph alignment and extension","authors":"Daqian Shi, Xiaoyue Li, Fausto Giunchiglia","doi":"10.1016/j.websem.2024.100832","DOIUrl":"10.1016/j.websem.2024.100832","url":null,"abstract":"<div><p>A common solution to the semantic heterogeneity problem is to perform knowledge graph (KG) extension exploiting the information encoded in one or more candidate KGs, where the alignment between the reference KG and candidate KGs is considered the critical procedure. However, existing KG alignment methods mainly rely on entity type (etype) label matching as a prerequisite, which performs poorly in practice or is not applicable in some cases. In this paper, we design a machine learning-based framework for KG extension, including a novel property-based alignment approach that aligns etypes on the basis of the properties used to define them. The main intuition is that it is properties that intentionally define the etype, and this definition is independent of the specific label used to name an etype, and of the specific hierarchical schema of KGs. 
Compared with the state-of-the-art, the experimental results show the validity of the KG alignment approach and the superiority of the proposed KG extension framework, both quantitatively and qualitatively.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"82 ","pages":"Article 100832"},"PeriodicalIF":2.1,"publicationDate":"2024-07-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1570826824000180/pdfft?md5=0e32d6cca795e8742e917608eef1c323&pid=1-s2.0-S1570826824000180-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141692104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multi-stream graph attention network for recommendation with knowledge graph","authors":"Zhifei Hu , Feng Xia","doi":"10.1016/j.websem.2024.100831","DOIUrl":"10.1016/j.websem.2024.100831","url":null,"abstract":"<div><p>In recent years, the powerful modeling ability of Graph Neural Networks (GNNs) has led to their widespread use in knowledge-aware recommender systems. However, existing GNN-based methods for information propagation among entities in knowledge graphs (KGs) may not efficiently filter out less informative entities. To address this challenge and improve the encoding of high-order structure information among many entities, we propose an end-to-end neural network-based method called Multi-stream Graph Attention Network (MSGAT). MSGAT explicitly discriminates the importance of entities from four critical perspectives and recursively propagates neighbor embeddings to refine the target node. Specifically, we use an attention mechanism from the user's perspective to distill the domain nodes' information of the predicted item in the KG, enhance the user's information on items, and generate the feature representation of the predicted item. We also propose a multi-stream attention mechanism to aggregate the KG neighborhood entity information of the items in the user's click history and generate the user's feature representation. 
We conduct extensive experiments on three real datasets for movies, music, and books, and the empirical results demonstrate that MSGAT outperforms current state-of-the-art baselines.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"82 ","pages":"Article 100831"},"PeriodicalIF":2.1,"publicationDate":"2024-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1570826824000179/pdfft?md5=b3464b8bed3c0ac35eee561e19ca6a2a&pid=1-s2.0-S1570826824000179-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141960757","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ontology design facilitating Wikibase integration — and a worked example for historical data","authors":"Cogan Shimizu , Andrew Eells , Seila Gonzalez , Lu Zhou , Pascal Hitzler , Alicia Sheill , Catherine Foley , Dean Rehberger","doi":"10.1016/j.websem.2024.100823","DOIUrl":"https://doi.org/10.1016/j.websem.2024.100823","url":null,"abstract":"<div><p>Wikibase – which is the software underlying Wikidata – is a powerful platform for knowledge graph creation and management. However, it has been developed with a crowd-sourced knowledge graph creation scenario in mind, which in particular means that it has not been designed for use case scenarios in which a tightly controlled high-quality schema, in the form of an ontology, is to be imposed, and indeed, independently developed ontologies do not necessarily map seamlessly to the Wikibase approach. In this paper, we provide the key ingredients needed in order to combine traditional ontology modeling with use of the Wikibase platform, namely a set of <em>axiom</em> patterns that bridge the paradigm gap, together with usage instructions and a worked example for historical data.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"82 ","pages":"Article 100823"},"PeriodicalIF":2.1,"publicationDate":"2024-06-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S157082682400009X/pdfft?md5=f2d0e2fffb17f5e6856c8379d489136d&pid=1-s2.0-S157082682400009X-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141481906","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Web3-DAO: An ontology for decentralized autonomous organizations","authors":"María-Cruz Valiente, Juan Pavón","doi":"10.1016/j.websem.2024.100830","DOIUrl":"10.1016/j.websem.2024.100830","url":null,"abstract":"<div><p>Decentralized autonomous organizations (DAOs) are a relatively new type of online entity related to governance or business models where all their members work together and participate in the decision-making processes affecting the DAO in a decentralized, collective, fair, and democratic manner. In a DAO, member interaction is mediated by software agents running on a blockchain that encode the governance of the specific entity in terms of rules that optimize their business and goals. In this context, most popular DAO software frameworks provide decision-making models aiming to facilitate digital governance and collaboration among their members, intertwining social and economic concerns. However, these models are complex, not interoperable with one another, and lack a common understanding and shared knowledge concerning DAOs, as well as the computational semantics needed to enable automated validation, simulation or execution. Thus, this paper presents an ontology (Web3-DAO), which can support machine-readable digital governance of DAOs by adding semantics to their decision-making models. The proposed ontology captures the domain logic that allows the sharing of updated information and decisions for all the members that interact with a DAO by the interoperability of their own assessment and decision tools. Furthermore, the ontology detects semantic ambiguities, uncertainties and contradictions. 
The Web3-DAO ontology is available in open access at <span>https://github.com/Grasia/semantic-web3-dao</span>.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"82 ","pages":"Article 100830"},"PeriodicalIF":2.1,"publicationDate":"2024-06-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1570826824000167/pdfft?md5=50e4d4b40c93103a13362ae80c817a36&pid=1-s2.0-S1570826824000167-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141411685","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving static and temporal knowledge graph embedding using affine transformations of entities","authors":"Jinfa Yang, Xianghua Ying, Yongjie Shi, Ruibin Wang","doi":"10.1016/j.websem.2024.100824","DOIUrl":"https://doi.org/10.1016/j.websem.2024.100824","url":null,"abstract":"<div><p>Finding a suitable embedding for a knowledge graph (KG) remains a major challenge. Many reliable knowledge graph embedding (KGE) models have been proposed that measure the distance or plausibility of triples and quadruples in static and temporal knowledge graphs. However, these classical models may not represent and infer various relation patterns well: for example, TransE cannot represent symmetric relations, DistMult cannot represent inverse relations, and RotatE cannot represent multiple relations. In this paper, we improve the ability of these models to represent various relation patterns by introducing an affine transformation framework. Specifically, we first use a set of affine transformations associated with each relation or timestamp to operate on entity vectors; the transformed vectors can then be applied not only to static KGE models but also to temporal KGE models. The main advantage of using affine transformations is their good geometric properties and interpretability. Our experimental results demonstrate that the proposed intuitive design with affine transformations provides a statistically significant increase in performance while adding only a few extra processing steps and keeping the same number of embedding parameters. Taking TransE as an example, we employ the scale transformation (a special case of an affine transformation). Surprisingly, it even outperforms RotatE to some extent on various datasets. 
We also introduce affine transformations into RotatE, Distmult, ComplEx, TTransE and TComplEx respectively, and experiments demonstrate that affine transformations consistently and significantly improve the performance of state-of-the-art KGE models on both static and temporal knowledge graph benchmarks.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"82 ","pages":"Article 100824"},"PeriodicalIF":2.5,"publicationDate":"2024-05-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1570826824000106/pdfft?md5=c556da96eab16cdef47d1fff590e4a7d&pid=1-s2.0-S1570826824000106-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141324691","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RevOnt: Reverse engineering of competency questions from knowledge graphs via language models","authors":"Fiorela Ciroku , Jacopo de Berardinis , Jongmo Kim , Albert Meroño-Peñuela , Valentina Presutti , Elena Simperl","doi":"10.1016/j.websem.2024.100822","DOIUrl":"10.1016/j.websem.2024.100822","url":null,"abstract":"<div><p>The process of developing ontologies – a formal, explicit specification of a shared conceptualisation – is addressed by well-known methodologies. As for any engineering development, its fundamental basis is the collection of requirements, which includes the elicitation of competency questions. Competency questions are defined through interacting with domain and application experts or by investigating existing datasets that may be used to populate the ontology, i.e., its knowledge graph. The rise in popularity and accessibility of knowledge graphs provides an opportunity to support this phase with automatic tools. In this work, we explore the possibility of extracting competency questions from a knowledge graph. This reverses the traditional workflow in which knowledge graphs are built from ontologies, which in turn are engineered from competency questions. We describe in detail RevOnt, an approach that extracts and abstracts triples from a knowledge graph, generates questions based on triple verbalisations, and filters the resulting questions to yield a meaningful set of competency questions; the WDV dataset. This approach is implemented utilising the Wikidata knowledge graph as a use case, and contributes a set of core competency questions from 20 domains present in the WDV dataset. To evaluate RevOnt, we contribute a new dataset of manually annotated high-quality competency questions, and compare the extracted competency questions by calculating their BLEU score against the human references. The results for the abstraction and question generation components of the approach show good to high quality. 
Meanwhile, the accuracy of the filtering component is above 86%, which is comparable to the state-of-the-art classifications.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"82 ","pages":"Article 100822"},"PeriodicalIF":2.1,"publicationDate":"2024-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1570826824000088/pdfft?md5=df0ecfc8d3506e224b7b22fbafe38dbf&pid=1-s2.0-S1570826824000088-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141058442","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"From fault detection to anomaly explanation: A case study on predictive maintenance","authors":"João Gama , Rita P. Ribeiro , Saulo Mastelini , Narjes Davari , Bruno Veloso","doi":"10.1016/j.websem.2024.100821","DOIUrl":"10.1016/j.websem.2024.100821","url":null,"abstract":"<div><p>Predictive Maintenance applications are increasingly complex, with interactions between many components. Black-box models are popular approaches based on deep-learning techniques due to their predictive accuracy. This paper proposes a neural-symbolic architecture that uses an online rule-learning algorithm to explain when the black-box model predicts failures. The proposed system solves two problems in parallel: (i) anomaly detection and (ii) explanation of the anomaly. For the first problem, we use an unsupervised state-of-the-art autoencoder. For the second problem, we train a rule learning system that learns a mapping from the input features to the autoencoder’s reconstruction error. Both systems run online and in parallel. The autoencoder signals an alarm for the examples with a reconstruction error that exceeds a threshold. The causes of the signal alarm are hard for humans to understand because they result from a non-linear combination of sensor data. The rule that triggers that example describes the relationship between the input features and the autoencoder’s reconstruction error. The rule explains the failure signal by indicating which sensors contribute to the alarm and allowing the identification of the component involved in the failure. The system can present global explanations for the black box model and local explanations for why the black box model predicts a failure. 
We evaluate the proposed system in a real-world case study of Metro do Porto and provide explanations that illustrate its benefits.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"81 ","pages":"Article 100821"},"PeriodicalIF":2.5,"publicationDate":"2024-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1570826824000076/pdfft?md5=5ac7d7b9118cab57dfc9acc7b6e52d40&pid=1-s2.0-S1570826824000076-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141040939","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A popular topic detection method based on microblog images and short text information","authors":"Wenjun Liu , Hai Wang , Jieyang Wang , Huan Guo , Yuyan Sun , Mengshu Hou , Bao Yu , Hailan Wang , Qingcheng Peng , Chao Zhang , Cheng Liu","doi":"10.1016/j.websem.2024.100820","DOIUrl":"10.1016/j.websem.2024.100820","url":null,"abstract":"<div><p>Popular topic detection identifies topics from the documents that users post on social networking platforms. In the research literature, most popular topic detection methods identify the distribution of unknown topics by integrating information from documents on social networking platforms. However, most of these methods have low accuracy because the texts are short and contain many uninformative punctuation marks and emoticons. Image information attached to short texts has also been overlooked, although it may contain the real subject matter of the user's post. To solve these problems and improve the quality of topic detection, this paper proposes a popular topic detection method based on microblog images and short text information. The method uses an image description model to obtain more information about short texts, identifies hot words with a new word discovery algorithm in the preprocessing stage, and uses a PTM model to improve the quality and effectiveness of topic detection during topic detection and aggregation. The experimental results show that the topic detection method in this paper improves the values of evaluation indicators compared with three other topic detection methods. 
In conclusion, the popular topic detection method proposed in this paper can improve the performance of topic detection by integrating microblog images and short text information, and outperforms other topic detection methods selected in this paper.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"81 ","pages":"Article 100820"},"PeriodicalIF":2.5,"publicationDate":"2024-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1570826824000064/pdfft?md5=27a6b3b5059b99e5d02665a7a31e8e9d&pid=1-s2.0-S1570826824000064-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141034966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A survey on semantic data management as intersection of ontology-based data access, semantic modeling and data lakes","authors":"Sayed Hoseini , Johannes Theissen-Lipp , Christoph Quix","doi":"10.1016/j.websem.2024.100819","DOIUrl":"https://doi.org/10.1016/j.websem.2024.100819","url":null,"abstract":"<div><p>In recent years, data lakes emerged as a way to manage large amounts of heterogeneous data for modern data analytics. One way to prevent data lakes from turning into inoperable data swamps is semantic data management. Such approaches propose the linkage of metadata to knowledge graphs based on the Linked Data principles to provide more meaning and semantics to the data in the lake. Such a semantic layer may be utilized not only for data management but also to tackle the problem of data integration from heterogeneous sources, in order to make data access more expressive and interoperable. In this survey, we review recent approaches with a specific focus on the application within data lake systems and scalability to Big Data. We classify the approaches into (i) basic semantic data management, (ii) semantic modeling approaches for enriching metadata in data lakes, and (iii) methods for ontology-based data access. In each category, we cover the main techniques and their background, and compare latest research. 
Finally, we point out challenges for future work in this research area, which needs a closer integration of Big Data and Semantic Web technologies.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"81 ","pages":"Article 100819"},"PeriodicalIF":2.5,"publicationDate":"2024-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1570826824000052/pdfft?md5=ba83860fb725179723385f42b29b9908&pid=1-s2.0-S1570826824000052-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140816991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}