Semantic Web: Latest Publications

Can you trust Wikidata?
IF 3.0 · CAS Tier 3 · Computer Science
Semantic Web · Pub Date: 2024-03-07 · DOI: 10.3233/sw-243577
V. Santos, Daniel Schwabe, Sérgio Lifschitz
Abstract: In order to use a value retrieved from a Knowledge Graph (KG) for some computation, the user should, in principle, ensure that they trust the veracity of the claim, i.e., consider the statement a fact. Crowd-sourced KGs, or KGs constructed by integrating several information sources of varying quality, must be used via a trust layer. The veracity of each claim in the underlying KG should be evaluated with respect to what is relevant to carrying out the action that motivates the information seeking. The present work assesses how well Wikidata (WD) supports the trust decision process implied by using its data. WD provides several mechanisms that can support this decision, and our KG Profiling, based on WD claims and schema, analyzes how multiple points of view, controversies, and potentially incomplete or incongruent content are presented and represented.
Citations: 0
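A trust layer of the kind described needs access to the provenance metadata Wikidata attaches to individual claims. The sketch below, a minimal illustration using the public SPARQL endpoint via SPARQLWrapper, lists the references attached to the statements of one property on one item; the item (Q42, Douglas Adams) and property (P69, "educated at") are our illustrative choices, not examples taken from the paper.

```python
# Minimal sketch: inspect the references attached to Wikidata statements,
# the kind of provenance metadata a trust layer would evaluate.
# The example item (Q42) and property (P69) are illustrative choices.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://query.wikidata.org/sparql",
                       agent="trust-inspection-example/0.1")
sparql.setQuery("""
SELECT ?statement ?refProperty ?refValue WHERE {
  wd:Q42 p:P69 ?statement .                 # full statement nodes
  ?statement prov:wasDerivedFrom ?refNode . # reference attached to the claim
  ?refNode ?refProperty ?refValue .         # reference details (stated in, URL, ...)
}
LIMIT 50
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["statement"]["value"], row["refProperty"]["value"], row["refValue"]["value"])
```

Statements missing from the result set carry no references at all, which is one of the signals a trust decision might weigh.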
Evidence of large-scale conceptual disarray in multi-level taxonomies in Wikidata
IF 3.0 · CAS Tier 3 · Computer Science
Semantic Web · Pub Date: 2024-03-07 · DOI: 10.3233/sw-243562
Atílio A. Dadalto, João Paulo A. Almeida, Claudenir M. Fonseca, Giancarlo Guizzardi
Abstract: The distinction between types and individuals is key to most conceptual modeling techniques and knowledge representation languages. Despite that, there are a number of situations in which modelers navigate this distinction inadequately, leading to problematic models. We show evidence of a large number of representation mistakes associated with the failure to respect this distinction in the Wikidata knowledge graph, identifiable through the incorrect use of instantiation (a relation between an instance and a type) and specialization, or subtyping (a relation between two types). The prevalence of these problems in Wikidata's taxonomies suggests that methodological and computational tools are required to mitigate the issues identified, which arise in many settings where individuals, types, and their metatypes are included in the domain of interest. We conduct a conceptual analysis of the entities involved in recurrent erroneous cases identified in this empirical data and present a tool that supports users in identifying some of these mistakes.
Citations: 0
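One symptom of the type/individual confusion the authors describe can be probed directly on the live endpoint. The heuristic below, flagging entities that are simultaneously an instance (P31) and a subclass (P279) of the same class, is our own illustrative check, not the authors' detection procedure.

```python
# Heuristic probe (ours, not the paper's method): entities that are both an
# instance of and a subclass of the same class often conflate the
# type/individual distinction.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("https://query.wikidata.org/sparql",
                       agent="taxonomy-check-example/0.1")
sparql.setQuery("""
SELECT ?entity ?class WHERE {
  ?entity wdt:P31 ?class ;    # instance of ?class ...
          wdt:P279 ?class .   # ... and subclass of the same ?class
}
LIMIT 20
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["entity"]["value"], "<->", row["class"]["value"])
```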
Ontology of active and passive environmental exposure
IF 3.0 · CAS Tier 3 · Computer Science
Semantic Web · Pub Date: 2024-03-01 · DOI: 10.3233/sw-243546
Csilla Vámos, Simon Scheider, Tabea Sonnenschein, R. Vermeulen
Abstract: Exposure is a central concept of the health and behavioural sciences needed to study the influence of the environment on the health and behaviour of people within a spatial context. While an increasing number of studies measure different forms of exposure, including the influence of air quality, noise, and crime, the influence of land cover on physical activity, or of the urban environment on food intake, we lack a common conceptual model of environmental exposure that captures its main structure across all this variety. Against the background of such a model, it becomes possible not only to systematically compare different methodological approaches but also to better link and align the content of the vast body of scientific publications on this topic. For example, an important methodical distinction is between studies that model exposure as an exclusive outcome of some activity and ones where the environment acts as a direct independent cause (active vs. passive exposure). Here, we propose an information ontology design pattern that can be used to define exposure and to model its variants. It is built around causal relations between concepts including persons, activities, concentrations, exposures, environments, and health risks. We formally define environmental stressors and variants of exposure using Description Logic (DL), which allows automatic inference from the RDF-encoded content of a paper. Furthermore, concepts can be linked with the data models and modelling methods used in a study. To test the pattern, we translated competency questions into SPARQL queries and ran them over RDF-encoded content. Results show how study characteristics can be classified and summarized in a manner that reflects important methodical differences.
Citations: 0
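To make the last step concrete, here is a minimal sketch of running one competency question ("which studies model active exposure?") as a SPARQL query over RDF-encoded study descriptions. The namespace, class and property names, and the input file are placeholders of ours; the IRIs of the published pattern differ.

```python
# Hypothetical sketch: a competency question run over RDF-encoded study
# descriptions. All IRIs and the file name are placeholders, not the
# pattern's actual vocabulary.
import rdflib

g = rdflib.Graph()
g.parse("encoded_studies.ttl")  # assumed file of RDF-encoded studies

cqa = """
PREFIX exo: <https://example.org/exposure#>
SELECT ?study WHERE {
  ?study a exo:Study ;
         exo:models ?exposure .
  ?exposure a exo:ActiveExposure .   # exposure as an outcome of an activity
}
"""
for row in g.query(cqa):
    print(row.study)
```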
TermIt: Managing normative thesauri
IF 3.0 · CAS Tier 3 · Computer Science
Semantic Web · Pub Date: 2024-02-28 · DOI: 10.3233/sw-243547
P. Kremen, Michal Med, Miroslav Blasko, Lama Saeeda, Martin Ledvinka, Alan Buzek
Abstract: Thesauri are popular, as they represent a manageable compromise: they are well understood by domain experts, yet formal enough to boost use cases like semantic search. Still, as thesauri grow in size and complexity within a domain, properly tracking concept references to their definitions in normative documents, interlinking concepts defined in different documents, and keeping all concepts semantically consistent and ready for subsequent conceptual modeling is difficult and requires adequate tool support. We present TermIt, a web-based thesaurus manager aimed at supporting the creation of thesauri based on decrees, directives, standards, and other normative documents. In addition to common editing capabilities, TermIt offers term extraction from documents (including a web-document annotation browser plug-in), tracking of term definitions in documents, term-quality and ontological-correctness checking, community discussions over term meanings, and seamless interlinking of concepts across different thesauri. We also show that TermIt's features fit e-government scenarios in the Czech Republic better than those of other tools, and we demonstrate its feasibility for these scenarios through a preliminary user-experience evaluation.
Citations: 0
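The data a tool like TermIt manages is essentially SKOS enriched with links to the defining documents. The rdflib sketch below builds one such term; the IRIs, the decree identifier, and the use of dcterms:source for the definition link are our illustrative assumptions, not TermIt's actual data model.

```python
# Sketch of a SKOS term tracked back to a normative document. IRIs and the
# decree identifier are invented for illustration.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import DCTERMS, RDF, SKOS

TERM = Namespace("https://example.org/thesaurus/")

g = Graph()
g.bind("skos", SKOS)
g.bind("dcterms", DCTERMS)

concept = TERM["building"]
g.add((concept, RDF.type, SKOS.Concept))
g.add((concept, SKOS.prefLabel, Literal("building", lang="en")))
g.add((concept, SKOS.definition,
       Literal("A permanent structure with a roof and walls.", lang="en")))
g.add((concept, DCTERMS.source, TERM["decree-183-2018"]))  # defining document (hypothetical)
g.add((concept, SKOS.broader, TERM["structure"]))

print(g.serialize(format="turtle"))
```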
How to create and use a national cross-domain ontology and data infrastructure on the Semantic Web
IF 3.0 · CAS Tier 3 · Computer Science
Semantic Web · Pub Date: 2024-02-23 · DOI: 10.3233/sw-243468
E. Hyvönen
Abstract: This paper presents a model and lessons learned for creating a cross-domain national ontology and Linked (Open) Data (LOD) infrastructure. The idea is to extend the global, domain-agnostic "layer cake model" underlying the Semantic Web with the domain-specific and local features needed in applications. To test and demonstrate the infrastructure, a series of LOD services and portals covering a wide range of application domains was created and put into use in 2002–2023. They have attracted millions of users in total, suggesting the feasibility of the proposed model. This line of research and development is unique due to its systematic national-level nature and its long time span of over twenty years.
Citations: 0
Enhancing Data Use Ontology (DUO) for health-data sharing by extending it with ODRL and DPV
IF 3.0 · CAS Tier 3 · Computer Science
Semantic Web · Pub Date: 2024-02-14 · DOI: 10.3233/sw-243583
H. Pandit, Beatriz Esteves
Abstract: The Global Alliance for Genomics and Health is an international consortium that is developing the Data Use Ontology (DUO) as a standard providing machine-readable codes for automation in data discovery and responsible sharing of genomics data. DUO concepts, which are encoded using OWL, only contain textual descriptions of the conditions for data use they represent and do not explicitly specify the intended permissions, prohibitions, and obligations, which limits their usefulness. We present an exploration of how the Open Digital Rights Language (ODRL) can explicitly represent the information inherent in DUO concepts as policies that express the conditions under which datasets are available for use, the conditions in requests to use them, and agreements generated by compatibility matching between the two. We also address a current limitation of DUO regarding information relevant to privacy and data protection law by using the Data Privacy Vocabulary (DPV), which supports expressing legal concepts both in a jurisdiction-agnostic manner and for specific laws like the GDPR. Our work supports the existing socio-technical governance processes involving the use of DUO by providing a complementary rather than a replacement approach. To support this and improve DUO, we describe how our system can be deployed, with a proof-of-concept demonstration that defines ODRL rules for all DUO concepts and uses them to generate agreements by matching requests to data offers. All resources described in this article are available at: https://w3id.org/duodrl/repo.
Citations: 1
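As a flavor of the mapping, the Turtle below phrases a dataset offer whose use is constrained to a DUO purpose as an ODRL policy. The shape of the constraint and the DUO code shown (DUO_0000007, disease-specific research) are our illustrative reading; the authors' actual rules for all DUO concepts are published at https://w3id.org/duodrl/repo.

```python
# Illustrative sketch, not the authors' published mapping: a data offer whose
# permitted purpose is tied to a DUO concept, expressed with the ODRL 2.2
# vocabulary and parsed with rdflib.
from rdflib import Graph

policy_ttl = """
@prefix odrl: <http://www.w3.org/ns/odrl/2/> .
@prefix obo:  <http://purl.obolibrary.org/obo/> .
@prefix ex:   <https://example.org/policies/> .

ex:offer-1 a odrl:Offer ;
    odrl:permission [
        odrl:target ex:genomics-dataset-1 ;        # hypothetical dataset IRI
        odrl:action odrl:use ;
        odrl:constraint [
            odrl:leftOperand odrl:purpose ;
            odrl:operator odrl:isA ;
            odrl:rightOperand obo:DUO_0000007      # disease-specific research
        ]
    ] .
"""

g = Graph()
g.parse(data=policy_ttl, format="turtle")
print(f"Parsed {len(g)} triples")
```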
The RDF2vec family of knowledge graph embedding methods
IF 3.0 · CAS Tier 3 · Computer Science
Semantic Web · Pub Date: 2024-01-25 · DOI: 10.3233/sw-233514
Jan Portisch, Heiko Paulheim
Abstract: Knowledge graph embeddings are a group of machine learning techniques that project the entities and relations of a knowledge graph into continuous vector spaces. RDF2vec is a scalable embedding approach rooted in the combination of random walks with a language model, and it has been used successfully in various applications. Recently, multiple variants of RDF2vec have been proposed, introducing variations on both the walk-generation and the language-modeling side; their combination has led to a growing family of RDF2vec variants. In this paper, we evaluate a total of twelve RDF2vec variants on a comprehensive set of benchmarks and compare them to seven existing knowledge graph embedding methods from the family of link prediction approaches. Besides the established GEval benchmark, which introduces various downstream machine learning tasks on the DBpedia knowledge graph, we also use the new DLCC (Description Logic Class Constructors) benchmark, consisting of two gold standards: one based on DBpedia and one based on synthetically generated graphs. The latter allows analyzing which ontological patterns in a knowledge graph can actually be learned by different embeddings. With this evaluation, we observe that certain tailored RDF2vec variants can improve performance on different downstream tasks, given the nature of the underlying problem, and that they in particular behave differently in modeling similarity and relatedness. The findings can be used as guidance in selecting a particular RDF2vec method for a given task.
Citations: 1
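The core recipe, random walks serialized as "sentences" followed by a word2vec model trained over them, fits in a few lines. The sketch below is a minimal illustration of that idea, not any of the twelve evaluated variants; the input file and hyperparameters are placeholders.

```python
# Minimal RDF2vec-style sketch: random walks over an RDF graph, then
# skip-gram word2vec over the walks. Real implementations (e.g. pyRDF2Vec)
# add many refinements; file name and parameters here are illustrative.
import random

import rdflib
from gensim.models import Word2Vec

g = rdflib.Graph()
g.parse("knowledge_graph.ttl")  # assumed input file

def random_walk(graph, start, depth=4):
    """One random walk as alternating entity / predicate tokens."""
    walk, node = [str(start)], start
    for _ in range(depth):
        edges = list(graph.predicate_objects(subject=node))
        if not edges:
            break
        pred, obj = random.choice(edges)
        walk.extend([str(pred), str(obj)])
        node = obj
    return walk

entities = {s for s in g.subjects() if isinstance(s, rdflib.URIRef)}
walks = [random_walk(g, e) for e in entities for _ in range(10)]

# Language model over the walks; each entity IRI gets a vector.
model = Word2Vec(sentences=walks, vector_size=100, window=5, min_count=1, sg=1)
some_entity = str(next(iter(entities)))
print(model.wv[some_entity][:5])
```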
CANARD: An approach for generating expressive correspondences based on competency questions for alignment
IF 3.0 · CAS Tier 3 · Computer Science
Semantic Web · Pub Date: 2024-01-24 · DOI: 10.3233/sw-233521
Élodie Thiéblin, Guilherme Sousa, Ollivier Haemmerlé, C. Trojahn
Abstract: Ontology matching aims at making ontologies interoperable. While the field has matured in recent years, most approaches are still limited to generating simple correspondences. More expressiveness is, however, required to better address the different kinds of ontology heterogeneity. This paper presents CANARD (Complex Alignment Need and A-box based Relation Discovery), an approach for generating expressive correspondences that relies on the notion of competency questions for alignment (CQAs). A CQA expresses the user's knowledge needs in terms of alignment and aims at reducing the alignment space. The approach takes as input a set of CQAs expressed as SPARQL queries over the source ontology. Correspondences are generated by matching the subgraph of a source CQA to the similar surroundings of instances from the target ontology. Evaluation is carried out on both synthetic and real-world datasets, and the impact of several approach parameters is discussed. Experiments have shown that CANARD performs better overall on CQA coverage than on precision, and that using existing sameAs links between the instances of the source and target ontologies gives better results than exact matches of their labels. Using CQAs also improved both CQA coverage and precision with respect to automatically generated queries. Reassessing counter-examples significantly increased precision, to the detriment of runtime. Finally, experiments on large datasets showed that CANARD is one of the few systems that can operate on large knowledge bases, but it depends on regularly populated knowledge bases and on the quality of instance links.
Citations: 0
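For concreteness, a CQA of the kind CANARD consumes is simply a SPARQL query over the source ontology, as in the sketch below; the conference-domain IRIs are placeholders of ours.

```python
# A competency question for alignment (CQA) as a SPARQL query over a source
# ontology, the form of input CANARD takes. IRIs are placeholders.
from rdflib.plugins.sparql import prepareQuery

cqa = prepareQuery("""
PREFIX src: <https://example.org/conference#>
SELECT ?paper ?author WHERE {
  ?paper a src:AcceptedPaper ;
         src:writtenBy ?author .
}
""")
# CANARD matches the subgraph of such a query against the surroundings of
# linked instances in the target ontology to propose an expressive
# correspondence for src:AcceptedPaper and src:writtenBy.
```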
OBO Foundry Food Ontology Interconnectivity
IF 3.0 · CAS Tier 3 · Computer Science
Semantic Web · Pub Date: 2024-01-16 · DOI: 10.3233/sw-233458
Damion M. Dooley, Liliana Andrés-Hernández, Georgeta Bordea, Leigh Carmody, D. Cavalieri, L. Chan, Pol Castellano-Escuder, C. Lachat, Fleur Mougin, F. Vitali, Chen Yang, Magalie Weber, Matthew Lange
Abstract: Since its creation in 2016, the FoodOn food ontology has become an interconnected partner in various academic and government projects spanning the agricultural and public health domains. This paper examines recent data-interoperability capabilities arising from food-related ontologies belonging to, or compatible with, the encyclopedic Open Biological and Biomedical Ontology Foundry (OBO) ontology platform, and how research organizations and industry might utilize them for their own projects or for data exchange. Projects seek standardized vocabulary across many food-supply activities, ranging from agricultural production, harvesting, preparation, food processing, marketing, distribution, and consumption to more indirect health, economic, food-security, and sustainability analysis and reporting tools. Satisfying this demand for controlled vocabulary requires establishing domain-specific ontologies whose curators coordinate closely to produce recommended patterns for food-system vocabulary.
Citations: 4
Wikidata subsetting: Approaches, tools, and evaluation
IF 3.0 · CAS Tier 3 · Computer Science
Semantic Web · Pub Date: 2023-12-27 · DOI: 10.3233/sw-233491
Seyed Amir Hosseini Beghaeiraveri, J. E. Labra Gayo, A. Waagmeester, Ammar Ammar, Carolina Gonzalez, D. Slenter, Sabah Ul-Hasan, E. Willighagen, Fiona McNeill, A. Gray
Abstract: Wikidata is a massive Knowledge Graph (KG), including more than 100 million data items and nearly 1.5 billion statements, covering a wide range of topics such as geography, history, scholarly articles, and life-science data. The large volume of Wikidata is difficult to handle for research purposes; many researchers cannot afford the cost of hosting 100 GB of data. While Wikidata provides a public SPARQL endpoint, it can only be used for short-running queries. Often, researchers require only a limited range of Wikidata focused on a particular topic for their use case. Subsetting is the process of defining and extracting the required range of data from the KG; this process has received increasing attention in recent years. Several approaches and specific tools have been developed for subsetting, but they have not yet been evaluated. In this paper, we survey the available subsetting approaches, introducing their general strengths and weaknesses, and evaluate four practical tools specific to Wikidata subsetting, WDSub, KGTK, WDumper, and WDF, in terms of execution performance, extraction accuracy, and flexibility in defining subsets. Results show that all four tools achieve a minimum of 99.96% accuracy in extracting defined items and 99.25% in extracting statements. The fastest tool in extraction is WDF, while the most flexible is WDSub. During the experiments, multiple subset use cases were defined and the extracted subsets analyzed, yielding valuable information about the variety and quality of Wikidata that would otherwise not be obtainable through the public Wikidata SPARQL endpoint.
Citations: 0
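For subsets small enough to fit within the endpoint's query timeout, a CONSTRUCT query against the public endpoint is a lightweight alternative to the dump-based tools evaluated in the paper. The sketch below extracts the direct claims of all house cats (Q146); the topic and the LIMIT are illustrative choices of ours.

```python
# Lightweight subsetting sketch (not one of the four evaluated tools):
# pull a small topical subset from the public endpoint as Turtle.
from SPARQLWrapper import SPARQLWrapper, TURTLE

sparql = SPARQLWrapper("https://query.wikidata.org/sparql",
                       agent="subset-example/0.1")
sparql.setQuery("""
CONSTRUCT { ?item ?p ?o }
WHERE {
  ?item wdt:P31 wd:Q146 .   # instance of: house cat (illustrative topic)
  ?item ?p ?o .
}
LIMIT 10000
""")
sparql.setReturnFormat(TURTLE)
subset = sparql.query().convert()        # bytes of Turtle
with open("cats_subset.ttl", "wb") as f:
    f.write(subset)
```

Anything larger than a few hundred thousand triples will hit the endpoint's timeout, which is exactly the regime where the dump-based tools compared in the paper become necessary.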