Carlos J. Fernández Candel, Jesús J. García-Molina, Diego Sevilla Ruiz
SkiQL: A unified schema query language
Data & Knowledge Engineering, volume 148, Article 102234. Published 2023-10-11.
DOI: 10.1016/j.datak.2023.102234
https://www.sciencedirect.com/science/article/pii/S0169023X23000940
Citations: 0
Abstract
Most NoSQL systems are schema-on-read: data can be stored without first declaring a schema that imposes a structure. This schemaless feature gives data-intensive applications the flexibility to evolve when data change frequently. However, not having to declare schemas does not mean they are absent; rather, they are implicit in data and code. Therefore, diagramming tools similar to those available for relational systems are also needed to help developers and administrators design and understand NoSQL schemas.
Visualizing diagrams is not practical when schemas contain hundreds of database entities, so exploration or query facilities are needed. In schemaless NoSQL stores, data of the same entity can be stored with different structures (e.g., non-uniform types and optional fields), which makes readable diagrams harder to produce.
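To make the idea of structural variation concrete, the following minimal sketch groups the documents of one entity by their set of (field, type) pairs and lists the fields that are optional. It is only an illustration: the function names, the coarse type labels, and the sample data are assumptions, not part of the paper or of any U-Schema tooling.

```python
from collections import defaultdict


def type_name(value):
    """Map a Python value to a coarse, illustrative type label."""
    if isinstance(value, bool):  # bool is checked first: it subclasses int
        return "Boolean"
    if isinstance(value, (int, float)):
        return "Number"
    if isinstance(value, str):
        return "String"
    if isinstance(value, list):
        return "Array"
    if isinstance(value, dict):
        return "Object"
    return "Null"


def structural_variations(documents):
    """Count documents per distinct set of (field, type) pairs."""
    counts = defaultdict(int)
    for doc in documents:
        signature = frozenset((name, type_name(v)) for name, v in doc.items())
        counts[signature] += 1
    return dict(counts)


def optional_fields(documents):
    """Return the fields present in some, but not all, documents."""
    if not documents:
        return set()
    all_fields = set().union(*(doc.keys() for doc in documents))
    common = set.intersection(*(set(doc.keys()) for doc in documents))
    return all_fields - common


# Hypothetical documents of a single "User" entity.
users = [
    {"name": "Ana", "age": 31},
    {"name": "Luis", "age": "42"},  # same field, different type
    {"name": "Eva", "age": 27, "email": "eva@example.org"},  # extra field
]

for signature, count in structural_variations(users).items():
    print(sorted(signature), count)
print(optional_fields(users))
```

Each distinct signature corresponds to one variation of the entity; a diagram that drew every variation in full would quickly become hard to read, which is the difficulty the paragraph above describes.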
NoSQL schema management tools should therefore have three main components: schema extraction, schema visualization, and schema query. As there are four main NoSQL data models, it is convenient to build such tools on a generic data model so that they can query and visualize schemas independently of the platform (data model and data store). To favor the creation of generic database tools, the authors of this paper defined the U-Schema unified data model, which integrates the four main NoSQL data models as well as the relational model.
This paper focuses on querying NoSQL and relational schemas represented as U-Schema models. We present SkiQL, a language designed on top of U-Schema to provide a platform-independent schema query service. SkiQL provides two constructs: schema-query and relationship-query. The former retrieves information about entity or relationship types, and the latter about aggregations or references (relations among types). We show how SkiQL was evaluated by calculating well-known language metrics and by surveying developers with NoSQL experience.
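The distinction between the two kinds of query can be sketched over a toy in-memory schema. This is not SkiQL syntax and not the U-Schema metamodel; the classes `SchemaType`, `Relationship`, and `Schema`, the two query functions, and the sample schema are all hypothetical names introduced only to illustrate querying types versus querying the relations among them.

```python
from dataclasses import dataclass, field


@dataclass
class SchemaType:
    name: str
    kind: str  # "entity" or "relationship" (illustrative labels)
    attributes: dict = field(default_factory=dict)  # attribute -> type label


@dataclass
class Relationship:
    kind: str    # "aggregation" or "reference"
    source: str  # name of the type that holds the relation
    target: str  # name of the type that is embedded or referenced


@dataclass
class Schema:
    types: list
    relationships: list


def query_types(schema, kind=None, name_prefix=""):
    """Schema-style query: select types by kind and name prefix."""
    return [
        t for t in schema.types
        if (kind is None or t.kind == kind) and t.name.startswith(name_prefix)
    ]


def query_relationships(schema, kind=None, source=None, target=None):
    """Relationship-style query: select aggregations or references."""
    return [
        r for r in schema.relationships
        if (kind is None or r.kind == kind)
        and (source is None or r.source == source)
        and (target is None or r.target == target)
    ]


# Hypothetical schema: a User embeds an Address and references a Movie.
schema = Schema(
    types=[
        SchemaType("User", "entity", {"name": "String", "age": "Number"}),
        SchemaType("Address", "entity", {"city": "String"}),
        SchemaType("Movie", "entity", {"title": "String"}),
        SchemaType("RATED", "relationship", {"stars": "Number"}),
    ],
    relationships=[
        Relationship("aggregation", "User", "Address"),
        Relationship("reference", "User", "Movie"),
    ],
)

print([t.name for t in query_types(schema, kind="entity")])
print(query_relationships(schema, kind="aggregation"))
```

Because both functions operate on the generic model rather than on a specific store, the same queries would apply regardless of which database the schema was extracted from, which is the platform independence the abstract refers to.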
About the journal:
Data & Knowledge Engineering (DKE) stimulates the exchange of ideas and interaction between these two related fields of interest. DKE reaches a world-wide audience of researchers, designers, managers and users. The major aim of the journal is to identify, investigate and analyze the underlying principles in the design and effective use of these systems.