An Automatic Method for Deriving OWL Ontologies from XML Documents
A. Minutolo, A. Esposito, Mario Ciampi, M. Esposito, G. Cassetti
2014 Ninth International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, published 2014-11-08
DOI: 10.1109/3PGCIC.2014.88 (https://doi.org/10.1109/3PGCIC.2014.88)
Citations: 1
Abstract
In the last decade, Big Data Analytics has become increasingly important in both the academic and the business communities. Typically, data are mostly structured, collected by different actors through heterogeneous and distributed information sources, and often stored and managed directly in XML. To enable large volumes of data to be described in a way that lets machines exploit their meaning, and thus to enable semantic queries and automatic inference procedures, this paper presents an automatic method for deriving OWL ontologies from XML schemas. The main contribution of the method is its ability to produce a target ontology from multiple XML schemas by discriminating between domain and cross-domain entities and, at the same time, simplifying the overall structure of the generated ontology, i.e., by eliminating unused cross-domain entities. The method has been applied to a concrete case in the healthcare domain, with the goal of generating an ontological model from the XML schemas implementing the HL7 Version 3 Clinical Document Architecture Release 2.
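The idea of mapping schema types to ontology classes while dropping unused cross-domain entities can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not detail. It assumes, purely for illustration, that each named `xs:complexType` becomes one `owl:Class`, that one schema holds the domain types and a second holds the shared (cross-domain) types, and that a shared type is kept only if it is reachable from a domain type. The two toy schemas, the function names, and the `http://example.org/onto#` namespace are all hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Toy domain schema (hypothetical, loosely styled after a clinical document).
DOMAIN_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="ClinicalDocument">
    <xs:sequence>
      <xs:element name="id" type="II"/>
      <xs:element name="effectiveTime" type="TS"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>"""

# Toy cross-domain schema (hypothetical): PQ is never referenced by the domain.
SHARED_XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:complexType name="II"/>
  <xs:complexType name="TS"/>
  <xs:complexType name="PQ"/>
</xs:schema>"""


def complex_types(xsd_text):
    """Map each named complexType to the type names its child elements reference."""
    root = ET.fromstring(xsd_text)
    result = {}
    for ctype in root.iter(XS + "complexType"):
        name = ctype.get("name")
        if name is None:
            continue  # anonymous types are ignored in this sketch
        refs = {el.get("type") for el in ctype.iter(XS + "element") if el.get("type")}
        result[name] = refs
    return result


def derive_classes(domain_xsd, shared_xsd):
    """Return (domain class names, shared class names actually used by the domain)."""
    domain = complex_types(domain_xsd)
    shared = complex_types(shared_xsd)
    used = set()
    stack = [t for refs in domain.values() for t in refs]
    while stack:  # follow references transitively through the shared schema
        t = stack.pop()
        if t in shared and t not in used:
            used.add(t)
            stack.extend(shared[t])
    return sorted(domain), sorted(used)


def to_turtle(class_names, base="http://example.org/onto#"):
    """Serialize class names as OWL class declarations in Turtle syntax."""
    lines = [
        "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
        f"@prefix : <{base}> .",
        "",
    ]
    lines += [f":{name} a owl:Class ." for name in class_names]
    return "\n".join(lines)


domain_classes, shared_classes = derive_classes(DOMAIN_XSD, SHARED_XSD)
print(to_turtle(domain_classes + shared_classes))
```

In this toy input, `PQ` is declared in the shared schema but never referenced from the domain schema, so it does not appear in the generated Turtle. A real XSD-to-OWL mapping would also have to handle attributes, simple types, type extension, and namespace-prefixed type references, none of which this sketch attempts.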