Fiorela Ciroku , Jacopo de Berardinis , Jongmo Kim , Albert Meroño-Peñuela , Valentina Presutti , Elena Simperl
{"title":"RevOnt: Reverse engineering of competency questions from knowledge graphs via language models","authors":"Fiorela Ciroku , Jacopo de Berardinis , Jongmo Kim , Albert Meroño-Peñuela , Valentina Presutti , Elena Simperl","doi":"10.1016/j.websem.2024.100822","DOIUrl":"10.1016/j.websem.2024.100822","url":null,"abstract":"<div><p>The process of developing ontologies – a formal, explicit specification of a shared conceptualisation – is addressed by well-known methodologies. As for any engineering development, its fundamental basis is the collection of requirements, which includes the elicitation of competency questions. Competency questions are defined through interacting with domain and application experts or by investigating existing datasets that may be used to populate the ontology i.e. its knowledge graph. The rise in popularity and accessibility of knowledge graphs provides an opportunity to support this phase with automatic tools. In this work, we explore the possibility of extracting competency questions from a knowledge graph. This reverses the traditional workflow in which knowledge graphs are built from ontologies, which in turn are engineered from competency questions. We describe in detail RevOnt, an approach that extracts and abstracts triples from a knowledge graph, generates questions based on triple verbalisations, and filters the resulting questions to yield a meaningful set of competency questions; the WDV dataset. This approach is implemented utilising the Wikidata knowledge graph as a use case, and contributes a set of core competency questions from 20 domains present in the WDV dataset. To evaluate RevOnt, we contribute a new dataset of manually-annotated high-quality competency questions, and compare the extracted competency questions by calculating their BLEU score against the human references. The results for the abstraction and question generation components of the approach show good to high quality. 
Meanwhile, the accuracy of the filtering component is above 86%, which is comparable to the state-of-the-art classifications.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"82 ","pages":"Article 100822"},"PeriodicalIF":2.1,"publicationDate":"2024-05-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1570826824000088/pdfft?md5=df0ecfc8d3506e224b7b22fbafe38dbf&pid=1-s2.0-S1570826824000088-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141058442","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
João Gama , Rita P. Ribeiro , Saulo Mastelini , Narjes Davari , Bruno Veloso
{"title":"From fault detection to anomaly explanation: A case study on predictive maintenance","authors":"João Gama , Rita P. Ribeiro , Saulo Mastelini , Narjes Davari , Bruno Veloso","doi":"10.1016/j.websem.2024.100821","DOIUrl":"10.1016/j.websem.2024.100821","url":null,"abstract":"<div><p>Predictive Maintenance applications are increasingly complex, with interactions between many components. Black-box models are popular approaches based on deep-learning techniques due to their predictive accuracy. This paper proposes a neural-symbolic architecture that uses an online rule-learning algorithm to explain when the black-box model predicts failures. The proposed system solves two problems in parallel: (i) anomaly detection and (ii) explanation of the anomaly. For the first problem, we use an unsupervised state-of-the-art autoencoder. For the second problem, we train a rule learning system that learns a mapping from the input features to the autoencoder’s reconstruction error. Both systems run online and in parallel. The autoencoder signals an alarm for the examples with a reconstruction error that exceeds a threshold. The causes of the signal alarm are hard for humans to understand because they result from a non-linear combination of sensor data. The rule that triggers that example describes the relationship between the input features and the autoencoder’s reconstruction error. The rule explains the failure signal by indicating which sensors contribute to the alarm and allowing the identification of the component involved in the failure. The system can present global explanations for the black box model and local explanations for why the black box model predicts a failure. 
We evaluate the proposed system in a real-world case study of Metro do Porto and provide explanations that illustrate its benefits.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"81 ","pages":"Article 100821"},"PeriodicalIF":2.5,"publicationDate":"2024-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1570826824000076/pdfft?md5=5ac7d7b9118cab57dfc9acc7b6e52d40&pid=1-s2.0-S1570826824000076-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141040939","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wenjun Liu , Hai Wang , Jieyang Wang , Huan Guo , Yuyan Sun , Mengshu Hou , Bao Yu , Hailan Wang , Qingcheng Peng , Chao Zhang , Cheng Liu
{"title":"A popular topic detection method based on microblog images and short text information","authors":"Wenjun Liu , Hai Wang , Jieyang Wang , Huan Guo , Yuyan Sun , Mengshu Hou , Bao Yu , Hailan Wang , Qingcheng Peng , Chao Zhang , Cheng Liu","doi":"10.1016/j.websem.2024.100820","DOIUrl":"10.1016/j.websem.2024.100820","url":null,"abstract":"<div><p>Popular topic detection identifies topics from the documents that users post on social networking platforms. Most popular topic detection methods in the literature identify the distribution of unknown topics by integrating information from such documents. However, most of these methods have low accuracy because the text is short and contains many uninformative punctuation marks and emoticons. Image information accompanying short texts has also been overlooked, although it may carry the real subject matter of the user's post. To address these problems and improve the quality of topic detection, this paper proposes a popular topic detection method based on microblog images and short text information. The method uses an image description model to obtain more information about short texts, identifies hot words with a new word discovery algorithm in the preprocessing stage, and uses a PTM model to improve the quality and effectiveness of topic detection and aggregation. The experimental results show that the proposed method improves the values of the evaluation indicators compared with three other topic detection methods. 
In conclusion, the popular topic detection method proposed in this paper can improve the performance of topic detection by integrating microblog images and short text information, and outperforms other topic detection methods selected in this paper.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"81 ","pages":"Article 100820"},"PeriodicalIF":2.5,"publicationDate":"2024-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1570826824000064/pdfft?md5=27a6b3b5059b99e5d02665a7a31e8e9d&pid=1-s2.0-S1570826824000064-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141034966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sayed Hoseini , Johannes Theissen-Lipp , Christoph Quix
{"title":"A survey on semantic data management as intersection of ontology-based data access, semantic modeling and data lakes","authors":"Sayed Hoseini , Johannes Theissen-Lipp , Christoph Quix","doi":"10.1016/j.websem.2024.100819","DOIUrl":"https://doi.org/10.1016/j.websem.2024.100819","url":null,"abstract":"<div><p>In recent years, data lakes emerged as a way to manage large amounts of heterogeneous data for modern data analytics. One way to prevent data lakes from turning into inoperable data swamps is semantic data management. Such approaches propose the linkage of metadata to knowledge graphs based on the Linked Data principles to provide more meaning and semantics to the data in the lake. Such a semantic layer may be utilized not only for data management but also to tackle the problem of data integration from heterogeneous sources, in order to make data access more expressive and interoperable. In this survey, we review recent approaches with a specific focus on the application within data lake systems and scalability to Big Data. We classify the approaches into (i) basic semantic data management, (ii) semantic modeling approaches for enriching metadata in data lakes, and (iii) methods for ontology-based data access. In each category, we cover the main techniques and their background, and compare latest research. 
Finally, we point out challenges for future work in this research area, which needs a closer integration of Big Data and Semantic Web technologies.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"81 ","pages":"Article 100819"},"PeriodicalIF":2.5,"publicationDate":"2024-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1570826824000052/pdfft?md5=ba83860fb725179723385f42b29b9908&pid=1-s2.0-S1570826824000052-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140816991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Marios Papamichalopoulos , George Papadakis , George Mandilaras , Maria Siampou , Nikos Mamoulis , Manolis Koubarakis
{"title":"Three-dimensional Geospatial Interlinking with JedAI-spatial","authors":"Marios Papamichalopoulos , George Papadakis , George Mandilaras , Maria Siampou , Nikos Mamoulis , Manolis Koubarakis","doi":"10.1016/j.websem.2024.100817","DOIUrl":"https://doi.org/10.1016/j.websem.2024.100817","url":null,"abstract":"<div><p>Geospatial data constitutes a considerable part of Semantic Web data, but so far, its sources are inadequately interlinked in the Linked Open Data cloud. Geospatial Interlinking aims to cover this gap by associating geometries with topological relations like those of the Dimensionally Extended 9-Intersection Model. Due to its quadratic time complexity, various algorithms aim to carry out Geospatial Interlinking efficiently. We present <em>JedAI-spatial</em>, a novel, open-source system that organizes these algorithms according to three dimensions: (i) <em>Space Tiling</em>, which determines the approach that reduces the search space, (ii) <em>Budget-awareness</em>, which distinguishes interlinking algorithms into batch and progressive ones, and (iii) <em>Execution mode</em>, which discerns between serial algorithms, running on a single CPU-core, and parallel ones, running on top of Apache Spark. 
We analytically describe JedAI-spatial’s architecture and capabilities and perform thorough experiments to provide interesting insights about the relative performance of its algorithms.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"81 ","pages":"Article 100817"},"PeriodicalIF":2.5,"publicationDate":"2024-03-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1570826824000039/pdfft?md5=59ac5500aad18c0d78d47b866d6b2073&pid=1-s2.0-S1570826824000039-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140549558","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Extraction of object-action and object-state associations from Knowledge Graphs","authors":"Alexandros Vassiliades , Theodore Patkos , Vasilis Efthymiou , Antonis Bikakis , Nick Bassiliades , Dimitris Plexousakis","doi":"10.1016/j.websem.2024.100816","DOIUrl":"10.1016/j.websem.2024.100816","url":null,"abstract":"<div><p>Infusing autonomous artificial systems with knowledge about the physical world they inhabit is a critical and long-held aim for the Artificial Intelligence community. Training systems with relevant data is a typical approach; however, finding the data required is not always possible, especially when much of this knowledge is commonsense. In this paper, we present a comparison of topology-based and semantics-based methods for extracting information about object-action and object-state association relations from knowledge graphs, such as ConceptNet, WordNet, ATOMIC, YAGO, WebChild and DBpedia. Moreover, we propose a novel method for extracting information about object-action and object-state associations from knowledge graphs. Our method is composed of a set of techniques for locating, enriching, evaluating, cleaning and exposing knowledge from such resources, relying on semantic similarity methods. 
Some important aspects of our method are the flexibility in deciding how to deal with the noise that exists in the data, and the capability to determine the importance of a path through training, rather than through manual annotation.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"81 ","pages":"Article 100816"},"PeriodicalIF":2.5,"publicationDate":"2024-03-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1570826824000027/pdfft?md5=ffd3cef20c3db3c0e3c77665c129fe41&pid=1-s2.0-S1570826824000027-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140182110","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Andreas Eibeck , Shaocong Zhang , Mei Qi Lim , Markus Kraft
{"title":"A simple and efficient approach to unsupervised instance matching and its application to linked data of power plants","authors":"Andreas Eibeck , Shaocong Zhang , Mei Qi Lim , Markus Kraft","doi":"10.1016/j.websem.2024.100815","DOIUrl":"10.1016/j.websem.2024.100815","url":null,"abstract":"<div><p>Knowledge graphs store and link semantically annotated data about real-world entities from a variety of domains and on a large scale. The World Avatar is based on a dynamic decentralised knowledge graph and on semantic technologies to realise complex cross-domain scenarios. Accurate computational results for such scenarios require the availability of complete, high-quality data. This work focuses on instance matching — one of the subtasks of automatically populating the knowledge graph with data from a wide spectrum of external sources. Instance matching compares two data sets and seeks to identify instances (data, records) referring to the same real-world entity. We introduce AutoCal, a new instance matcher which does not require labelled data and runs out of the box for a wide range of domains without tuning method-specific parameters. AutoCal achieves results competitive to recently proposed unsupervised matchers from the field of Machine Learning. We also select an unsupervised state-of-the-art matcher from the field of Deep Learning for a thorough comparison. Our results show that neither AutoCal nor the state-of-the-art matcher is superior regarding matching quality while AutoCal has only moderate hardware requirements and runs 2.7 to 60 times faster. In summary, AutoCal is specifically well-suited to be used in an automated environment. 
We present its prototypical integration into the World Avatar and apply AutoCal to the domain of power plants which is relevant for practical environmental scenarios of the World Avatar.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"80 ","pages":"Article 100815"},"PeriodicalIF":2.5,"publicationDate":"2024-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1570826824000015/pdfft?md5=3ea0d1c12ee82e1292dd9975673bdbcc&pid=1-s2.0-S1570826824000015-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139918083","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FIDES: An ontology-based approach for making machine learning systems accountable","authors":"Izaskun Fernandez , Cristina Aceta , Eduardo Gilabert , Iker Esnaola-Gonzalez","doi":"10.1016/j.websem.2023.100808","DOIUrl":"https://doi.org/10.1016/j.websem.2023.100808","url":null,"abstract":"<div><p>Although technologies based on Artificial Intelligence (AI) are now rather mature, their adoption, deployment and application are not as widespread as might be expected. This can be attributed to many barriers, among which users' lack of trust stands out. Accountability is a relevant factor for progress on trustworthiness, as it makes it possible to determine the causes that led to a given decision or suggestion made by an AI system. This article focuses on the accountability of a specific branch of AI, statistical machine learning (ML), based on a semantic approach. It presents FIDES, an ontology-based approach towards achieving the accountability of ML systems, in which all the relevant information related to an ML-based model is semantically annotated, from the dataset and model parametrisation to deployment aspects, so that it can be exploited later to answer questions related to reproducibility, replicability and, ultimately, accountability. 
The feasibility of the proposed approach has been demonstrated in two scenarios, real-world energy efficiency and manufacturing, and it is expected to pave the way towards raising awareness about the potential of Semantic Technologies in different factors that may be key in the trustworthiness of AI-based systems.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"79 ","pages":"Article 100808"},"PeriodicalIF":2.5,"publicationDate":"2023-11-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138087525","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Klevis Shkembi , Petar Kochovski , Thanasis G. Papaioannou , Caroline Barelle , Vlado Stankovski
{"title":"Semantic Web and blockchain technologies: Convergence, challenges and research trends","authors":"Klevis Shkembi , Petar Kochovski , Thanasis G. Papaioannou , Caroline Barelle , Vlado Stankovski","doi":"10.1016/j.websem.2023.100809","DOIUrl":"https://doi.org/10.1016/j.websem.2023.100809","url":null,"abstract":"<div><p>In recent years, we have witnessed the rise of blockchain technology, which has led to better transparency, traceability, and therefore trustworthy exchange of digital assets among different actors. At the same time, achieving trustworthy content exchange has been one of the primary objectives of the Semantic Web, an initiative of the World Wide Web Consortium. Semantic Web and blockchain technologies are the fundamental building blocks of Web3 (the third version of the Internet), which aims to link data through a decentralized approach. Blockchain provides a decentralized and secure framework for users to safeguard their data and take control over their data and Web3 experiences. However, developing trustworthy decentralized applications (Dapps) is a challenge because many blockchain-based functionalities must be developed from scratch and combined with data semantics to open new innovative opportunities. In this survey paper, we explore the cross-cutting domain of the Semantic Web and blockchain and identify the critical building blocks required to achieve trust in the Next-Generation Internet. The application domains that could benefit from these technologies are also investigated. We carried out an in-depth analysis of the literature published between 2015 and 2023, searching several digital libraries (e.g., Elsevier, IEEE, ACM). The search retrieved 137 papers, of which 97 were selected as relevant for inclusion in the paper. Furthermore, we studied several aspects (e.g., network type, transactions per second) of existing blockchain platforms. 
Semantic Web and blockchain technologies can be used to realize a verification and certification process for data quality. Examples of mechanisms to achieve this are the Decentralized Identities of the Semantic Web or the various blockchain consensus protocols that help achieve decentralization and realize democratic principles. Therefore, Semantic Web and blockchain technologies should be combined to achieve trust in the highly decentralized, semantically complex, and dynamic environments needed to build smart applications of the future.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"79 ","pages":"Article 100809"},"PeriodicalIF":2.5,"publicationDate":"2023-11-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138087526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhigang Hao , Wolfgang Mayer , Jingbo Xia , Guoliang Li , Li Qin , Zaiwen Feng
{"title":"Ontology alignment with semantic and structural embeddings","authors":"Zhigang Hao , Wolfgang Mayer , Jingbo Xia , Guoliang Li , Li Qin , Zaiwen Feng","doi":"10.1016/j.websem.2023.100798","DOIUrl":"https://doi.org/10.1016/j.websem.2023.100798","url":null,"abstract":"<div><p>Ontology alignment is essential for data integration and interoperability across applications in diverse disciplines. In recent decades, significant advances have been made in methods and systems for ontology alignment. Empirical results suggest that ontological semantics can be effectively employed to enhance the alignment process. In addition, structural information is crucial for ontology alignment, as it reflects the relations among adjacent concepts in the ontology. Previous work is mainly based on external lexicons and predefined rules derived from the ontological structure. Recently, deep learning has had a positive impact on ontology alignment and brought substantial improvements.</p><p>This paper proposes a new method based on ontology embedding that incorporates semantic and structural features. It uses the distance between the embeddings of two ontological concepts as the criterion for alignment. The proposed method is used to align two widely used food ontologies and three Chinese food classification ontologies. The experimental results show that our method improves performance compared to several state-of-the-art alignment systems, demonstrating the importance of learning semantic and structural representations. Furthermore, the proposed method is evaluated on several tracks of the Ontology Alignment Evaluation Initiative (OAEI), and experimental results show that our method outperforms other baselines in effectiveness. 
The data and code can be obtained from: https://github.com/haozhigang1111/Ontology-Alignment.git.</p></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"78 ","pages":"Article 100798"},"PeriodicalIF":2.5,"publicationDate":"2023-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49881572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}