Design of a New Distributed NoSQL Database with Distributed Hash Tables

Log. J. IGPL, Pub Date: 2021-03-17, DOI: 10.1093/JIGPAL/JZAB003

Agustín San Román Guzmán, Diego Valdeolmillos, Alberto Rivas, Angélica González Arrieta, P. Chamoso

Citations: 1

Abstract

Databases play a fundamental role in today’s world and are used by most companies, especially those that offer services over the Internet. A wide variety of database models now exists, each suited to the specific requirements of a class of applications. Traditionally, relational models with centralized architectures were the most widely used, mainly because of their simplicity and their general-purpose query language, which made relational systems suitable for almost any application. However, as the Internet grew in recent decades, both in number of users and in volume of information, those centralized models began to suffer from availability and scalability problems. To address them, decentralized architectures and alternative database models emerged, eventually replacing relational databases and centralized architectures where availability and scalability requirements are high. These alternatives to the traditional relational model are grouped under the name NoSQL (Not only Structured Query Language). In this article, we present a NoSQL database developed as a final degree project. It has a flexible, document-based data model and a fully decentralized architecture that uses the Gossip protocol for node discovery and a distributed hash table, specifically the rendezvous hashing algorithm, to distribute and replicate data across all the nodes. The main goals of the system are high availability (the data should almost always be accessible) and high scalability (the system should be able to grow its capacity, in both data volume and number of users, by adding nodes). High availability is achieved through data replication. High scalability is achieved through the decentralized architecture, which allows requests to enter through multiple points, and through data distribution, so that adding nodes effectively increases the capacity of the database.
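
The abstract names rendezvous hashing as the mechanism that distributes and replicates data, but gives no implementation details. The following is a minimal illustrative sketch of the general technique, not the authors' code. The function names `score` and `replicas_for`, the use of SHA-256, the node names, and the replication factor of 3 are all assumptions made for this example. Each node is scored against the key, and the top-scoring nodes hold the replicas.

```python
import hashlib


def score(node: str, key: str) -> int:
    """Deterministic weight for a (node, key) pair, derived from SHA-256.

    Hypothetical helper: the paper does not state which hash function it uses.
    """
    digest = hashlib.sha256(f"{node}|{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def replicas_for(key: str, nodes: list[str], replication_factor: int = 3) -> list[str]:
    """Return the nodes that should store `key`, highest score first.

    Every node can compute this ranking locally from the membership list,
    so any node can act as an entry point and route a request correctly.
    """
    # Sort by (score, name) so the result does not depend on input order.
    ranked = sorted(nodes, key=lambda n: (score(n, key), n), reverse=True)
    return ranked[:replication_factor]


if __name__ == "__main__":
    # Hypothetical five-node cluster.
    cluster = ["node-a", "node-b", "node-c", "node-d", "node-e"]
    owners = replicas_for("user:42", cluster)
    print("replicas:", owners)

    # If the primary fails, the surviving replicas keep their role and
    # the next-ranked node is promoted into the replica set.
    survivors = [n for n in cluster if n != owners[0]]
    print("after failure:", replicas_for("user:42", survivors))
```

Because a node's score for a key does not depend on which other nodes exist, adding or removing a node only moves the keys whose top-ranked set actually changes. That property is what makes the scheme a natural fit for the scalability goal described above.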
