On-line Versioned Schema Inference for Large Semantic Web Data Sources
Kenza Kellou-Menouer, Zoubida Kedad
Proceedings of the 29th International Conference on Scientific and Statistical Database Management, 2017-06-27
DOI: 10.1145/3085504.3085513 (https://doi.org/10.1145/3085504.3085513)
Citations: 6
Abstract
A growing number of data sources expressed in RDF(S)/OWL are available on the Web, and they are increasingly used in big data and real-time applications. These data sources may be created without a formally defined schema, in which case the schema is implicit in the stored data. Even when a schema is defined, the instances of a source do not have to conform to it. This offers more flexibility and eases data evolution, but it comes at the cost of losing the description of the data, which can be useful in many contexts. In this paper, we present SchemaDecrypt, a novel approach for discovering a versioned schema for a remote data source. SchemaDecrypt enables the discovery of the different structures of the existing classes. Our approach discovers the versions on-line, without uploading or browsing the data source. This makes it possible to overcome the querying restrictions of the source and the combinatorial explosion of candidate versions. We present experimental evaluations on DBpedia to demonstrate the performance of our approach.
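To make the notion of "different structures of the existing classes" concrete, the following is a minimal sketch that treats a version of a class as a distinct set of properties observed on its instances. This reading is an assumption made for illustration, and the sketch is not the SchemaDecrypt algorithm: it scans a small in-memory list of triples, whereas the approach described above works on-line against a remote source. The function name `class_versions`, the `ex:` identifiers, and the toy data are all hypothetical.

```python
from collections import defaultdict

# Hypothetical toy triples (subject, property, value); not taken from DBpedia.
TRIPLES = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:alice", "ex:birthDate", "1990-01-01"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:bob", "ex:name", "Bob"),
    ("ex:carol", "rdf:type", "ex:Person"),
    ("ex:carol", "ex:name", "Carol"),
    ("ex:carol", "ex:birthDate", "1985-05-05"),
]


def class_versions(triples, type_property="rdf:type"):
    """Group the instances of each class by the set of properties they carry.

    Assumption for this sketch: one "version" of a class is one distinct
    property set found among its instances.
    """
    types = defaultdict(set)  # instance -> classes it is declared to belong to
    props = defaultdict(set)  # instance -> properties it uses (type excluded)
    for subject, prop, value in triples:
        if prop == type_property:
            types[subject].add(value)
        else:
            props[subject].add(prop)

    # class -> property set -> instances exhibiting exactly that property set
    versions = defaultdict(lambda: defaultdict(set))
    for subject, classes in types.items():
        structure = frozenset(props[subject])
        for cls in classes:
            versions[cls][structure].add(subject)
    return versions


if __name__ == "__main__":
    for cls, by_structure in class_versions(TRIPLES).items():
        for structure, instances in by_structure.items():
            print(cls, sorted(structure), sorted(instances))
```

On this toy data, `ex:Person` has two structures: one with only `ex:name` and one with both `ex:name` and `ex:birthDate`. A full local scan like this is exactly what a remote, query-restricted source rules out, and enumerating property sets naively is where the combinatorial explosion mentioned above would arise.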