Elisavet Koutsiana, Tushita Yadav, Nitisha Jain, Albert Meroño-Peñuela, Elena Simperl
{"title":"Agreeing and disagreeing in collaborative knowledge graph construction: An analysis of Wikidata","authors":"Elisavet Koutsiana, Tushita Yadav, Nitisha Jain, Albert Meroño-Peñuela, Elena Simperl","doi":"10.1016/j.websem.2025.100868","DOIUrl":"10.1016/j.websem.2025.100868","url":null,"abstract":"<div><div>In this work, we study disagreements in discussions around Wikidata, an online knowledge community that builds the data backend of Wikipedia. Discussions are essential in collaborative work as they can increase contributor performance and encourage the emergence of shared norms and practices. While disagreements can play a productive role in discussions, they can also lead to conflicts and controversies, which impact contributors’ well-being and their motivation to engage. We want to understand if and when such phenomena arise in Wikidata, using a mix of quantitative and qualitative analyses to identify the types of topics people disagree about, the most common patterns of interaction, and the roles people play when arguing for or against an issue. We find that decisions to create Wikidata properties are much faster than those to delete properties and that more than half of controversial discussions do not lead to consensus. Our analysis suggests that Wikidata is an inclusive community, considering different opinions when making decisions, and that conflict and vandalism are rare in discussions. At the same time, while one-fourth of the editors participating in controversial discussions contribute legitimate and insightful opinions about Wikidata’s emerging issues, they respond with one or two posts and do not remain engaged in the discussions to reach consensus. Our work contributes to the analysis of collaborative KG construction with insights about communication and decision-making in projects, as well as with methodological directions and open datasets. 
We hope our findings will help managers and designers support community decision-making and improve discussion tools and practices.</div></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"86 ","pages":"Article 100868"},"PeriodicalIF":2.1,"publicationDate":"2025-06-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144262851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"What can knowledge graph do for few-shot named entity recognition","authors":"Binling Nie, Yiming Shao, Yigang Wang","doi":"10.1016/j.websem.2025.100866","DOIUrl":"10.1016/j.websem.2025.100866","url":null,"abstract":"<div><div>Due to its extensive applicability in various downstream domains, few-shot named entity recognition (NER) has attracted increasing attention, particularly in areas where acquiring sufficient labeled data poses a significant challenge. Recent studies have highlighted the potential of knowledge graphs (KGs) in enhancing natural language processing (NLP) tasks. However, a comprehensive understanding of whether and how KGs can effectively improve the NER performance under low-resource conditions remains elusive. In this paper, for the first time, we quantitatively investigate the effects of different kinds of extra KG features for few-shot NER. We enable our analysis by aggregating extra KG features into an NER framework. Through extensive experiments, we find that incorporating class features yields the best performance. 
To fully explore the potential of class features from KGs, we propose a novel network architecture, named KGen, to jointly leverage KG-based knowledge from both the input sentence side and the label semantic side for few-shot NER. The efficacy of our proposed method is validated through extensive experiments on five challenging datasets.</div></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"86 ","pages":"Article 100866"},"PeriodicalIF":2.1,"publicationDate":"2025-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144134312","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nadir Guetmi, Abdessamad Imine, Moulay Driss Mechaoui
{"title":"MobiRDF: A cloud-based collaborative editing service for mobile RDF data sharing","authors":"Nadir Guetmi, Abdessamad Imine, Moulay Driss Mechaoui","doi":"10.1016/j.websem.2025.100864","DOIUrl":"10.1016/j.websem.2025.100864","url":null,"abstract":"<div><div>In this paper, we present <span>MobiRDF</span>, a novel cloud-based approach designed for the efficient and scalable management of RDF data, enabling real-time sharing and editing. <span>MobiRDF</span> offers two main services: <em>(i) Partial Replication of RDF Graphs</em>: This service facilitates the selective replication of RDF graphs on mobile devices, addressing their inherent resource limitations. Our partial graph selector stores only the data the user requests from the RDF graph instead of the entire RDF graph, which enables efficient data storage and retrieval. <em>(ii) Collaboration Protocol</em>: This protocol provides synchronization mechanisms for collaborative work in a fully decentralized manner. It uses a commutativity-based consistency model to maintain the consistency of the shared RDF graph, ensuring seamless collaboration among users. The heavier computational tasks, such as dynamic group management, synchronization merging, and reasoning processes, are managed in the Cloud, optimizing the performance of resource-constrained mobile devices. The key novelty of <span>MobiRDF</span> is its ability to ensure both syntactic and semantic consistency of shared RDF data, through reasoning processes using the Closed-World Assumption (CWA) for inferring new triples. 
Experimental evaluations show that <span>MobiRDF</span> is efficient in terms of network bandwidth and energy consumption, validating its effectiveness in real-world scenarios.</div></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"86 ","pages":"Article 100864"},"PeriodicalIF":2.1,"publicationDate":"2025-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143463946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Two ontology design patterns in the domain of collections","authors":"Idoia Berges, Arantza Illarramendi","doi":"10.1016/j.websem.2025.100863","DOIUrl":"10.1016/j.websem.2025.100863","url":null,"abstract":"<div><div>Collections are objects used to arrange, into a single unit, multiple data items that form a natural group. Different types of collections exist, due to different constraints based on whether or not they impose an order on their elements and whether or not they allow repetition of elements. All of them are easily found in several domains of our everyday life. For instance, a deck of cards, the prime divisors of a number or the teams that compete in a championship can be seen as a collection. Thus, an effective modeling of collections is a recurring issue in information management.</div><div>In the ontology design field, recurring modeling problems can be addressed by the use of Ontology Design Patterns (ODPs). In the case of collections, ODPs have been proposed for representing sequences, lists, sets and bags. However, none of these patterns are completely adequate for representing collections of ordered elements without repetition. In this paper we present an ODP for representing that notion, which we have named <em>Permutation</em>. Moreover, another ODP named <em>ListOfPermutations</em> is also introduced, which makes it possible to represent how the order of a <em>Permutation</em> varies over time. 
Because not all constraints required by these ODPs can be represented in OWL 2, SHACL shapes have been used in their definitions.</div></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"85 ","pages":"Article 100863"},"PeriodicalIF":2.1,"publicationDate":"2025-02-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143445381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Accelerating knowledge graph and ontology engineering with large language models","authors":"Cogan Shimizu, Pascal Hitzler","doi":"10.1016/j.websem.2025.100862","DOIUrl":"10.1016/j.websem.2025.100862","url":null,"abstract":"<div><div>Large Language Models bear the promise of significant acceleration of key Knowledge Graph and Ontology Engineering tasks, including ontology modeling, extension, modification, population, alignment, as well as entity disambiguation. We lay out LLM-based Knowledge Graph and Ontology Engineering as a new and emerging area of research, and argue that modular approaches to ontologies will be of central importance.</div></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"85 ","pages":"Article 100862"},"PeriodicalIF":2.1,"publicationDate":"2025-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143419299","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Logic Augmented Generation","authors":"Aldo Gangemi, Andrea Giovanni Nuzzolese","doi":"10.1016/j.websem.2024.100859","DOIUrl":"10.1016/j.websem.2024.100859","url":null,"abstract":"<div><div>Semantic Knowledge Graphs (SKG) face challenges with scalability, flexibility, contextual understanding, and handling unstructured or ambiguous information. However, they offer formal and structured knowledge enabling highly interpretable and reliable results by means of reasoning and querying. Large Language Models (LLMs) may overcome those limitations, making them suitable in open-ended tasks and unstructured environments. Nevertheless, LLMs are hardly interpretable and often unreliable. To get the best out of LLMs and SKGs, we envision Logic Augmented Generation (LAG) to combine the benefits of the two worlds. LAG uses LLMs as Reactive Continuous Knowledge Graphs that can generate potentially infinite relations and tacit knowledge on-demand. LAG uses SKGs to inject a discrete heuristic dimension with clear logical and factual boundaries. We exemplify LAG in two tasks of collective intelligence, i.e., medical diagnostics and climate projections. Understanding the properties and limitations of LAG, which are still mostly unknown, is of utmost importance for enabling a variety of tasks involving tacit knowledge in order to provide interpretable and effective results.</div></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"85 ","pages":"Article 100859"},"PeriodicalIF":2.1,"publicationDate":"2025-01-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143165568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Knowledge Graphs as a source of trust for LLM-powered enterprise question answering","authors":"Juan Sequeda, Dean Allemang, Bryon Jacob","doi":"10.1016/j.websem.2024.100858","DOIUrl":"10.1016/j.websem.2024.100858","url":null,"abstract":"<div><div>Generative AI provides an innovative and exciting way to manage knowledge and data at any scale: for small projects, at the enterprise level, and even at a world wide web scale. It is tempting to think that Generative AI has made other knowledge-based technologies obsolete; that anything we wanted to do with knowledge-based systems, Knowledge Graphs or even expert systems can instead be done with Generative AI. Our position is counter to that conclusion.</div><div>Our practical experience in implementing enterprise question answering systems using Generative AI has shown that Knowledge Graphs support this infrastructure in multiple ways: they provide a formal framework to evaluate the validity of a query generated by an LLM, serve as a foundation for explaining results, and offer access to governed and trusted data. In this position paper, we share our experience, present industry needs, and outline the opportunities for future research contributions.</div></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"85 ","pages":"Article 100858"},"PeriodicalIF":2.1,"publicationDate":"2025-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143165567","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The ESW of Wikidata: Exploratory search workflows on Knowledge Graphs","authors":"Matteo Lissandrini, Gianmarco Prando, Gianmaria Silvello","doi":"10.1016/j.websem.2024.100860","DOIUrl":"10.1016/j.websem.2024.100860","url":null,"abstract":"<div><div>Exploratory search on Knowledge Graphs (KGs) arises when a user needs to understand and extract insights from an unfamiliar KG. In these exploratory sessions, the users issue a series of queries to identify relevant portions of the KG that can answer their questions, with each query answer informing the formulation of the next query. Despite the widespread adoption of KGs, the needs of current KG exploration use cases are not well understood. This work presents the “Exploratory Search Workflows” (ESW) collection focusing on real-world exploration sessions of an open-domain KG, Wikidata, conducted by 57 M.Sc. Computer Engineering students in two advanced Graph Database course editions. This resource includes 234 real exploratory workflows, each containing an average of 45 SPARQL queries and reference workflows that serve as gold-standard solutions to the proposed tasks. The ESW collection is also available as an RDF graph and accessible via a public SPARQL endpoint. 
It allows for analysis of real user sessions, understanding query evolution and complexity, and serves as the first query benchmark for KG management systems for exploratory search.</div></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"85 ","pages":"Article 100860"},"PeriodicalIF":2.1,"publicationDate":"2025-01-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143165570","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Knowledge graph based entity selection framework for ad-hoc retrieval","authors":"Pankaj Singh, Plaban Kumar Bhowmick","doi":"10.1016/j.websem.2024.100848","DOIUrl":"10.1016/j.websem.2024.100848","url":null,"abstract":"<div><div>Recent entity-based retrieval models utilizing knowledge bases have shown significant improvement in ad-hoc retrieval. However, a lack of coherence between candidate entities can lead to query intent drift at retrieval time. To address this issue, we present an entity selection algorithm that utilizes a graph clustering framework to discover the semantics between entities and enrich the query with highly coherent entities accumulated from different resources, including knowledge bases and pseudo-relevance feedback documents. Through this work, we propose: (1) an entity acquisition strategy to systematically acquire coherent entities for query expansion; (2) a graph representation of entities to capture the coherence between entities, where nodes correspond to the entities and edges represent semantic relatedness between entities; and (3) two different entity ranking approaches to select candidate entities based on their coherence with query entities and other coherent entities. Experiments on five TREC collections (ClueWeb09B, ClueWeb12B, Robust04, GOV2, and MS-Marco) under the document retrieval task were conducted to verify the proposed algorithm’s performance. The results indicate that the proposed methodology outperforms existing state-of-the-art retrieval approaches in terms of MAP, NDCG, and P@20. 
The code and relevant data are available at <span><span>https://github.com/pankajkashyap65/KnowledgeGraph</span></span>.</div></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"84 ","pages":"Article 100848"},"PeriodicalIF":2.1,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143161137","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Leveraging Knowledge Graphs for AI System Auditing and Transparency","authors":"Laura Waltersdorfer, Marta Sabou","doi":"10.1016/j.websem.2024.100849","DOIUrl":"10.1016/j.websem.2024.100849","url":null,"abstract":"<div><div>Auditing complex Artificial Intelligence (AI) systems is gaining importance in light of new regulations and is particularly challenging in terms of system complexity, knowledge integration, and differing transparency needs. Current AI auditing tools, however, lack semantic context, which makes it difficult for auditors to collect and integrate audit data effectively, as well as to analyse and query it. In this position paper, we explore how Knowledge Graphs (KGs) can address these challenges by offering a structured and integrative approach to collecting and transforming audit traces. This work discusses the current limitations in both AI auditing processes and tools. Furthermore, we examine how KGs can play a transformative role in overcoming these obstacles to achieve improved auditability and transparency of AI systems.</div></div>","PeriodicalId":49951,"journal":{"name":"Journal of Web Semantics","volume":"84 ","pages":"Article 100849"},"PeriodicalIF":2.1,"publicationDate":"2025-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143161140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}