Improving the performance of big data databases

D. R. Arif, Nzar Abdulqadir Ali
Kurdistan Journal of Applied Research, published 2019-12-31
DOI: 10.24017/science.2019.2.20 (https://doi.org/10.24017/science.2019.2.20)

Abstract

Real-time monitoring systems use two types of database: relational databases such as MySQL and non-relational databases such as MongoDB. A relational database management system (RDBMS) stores data in a structured format using rows and columns. It is called relational because the values in its tables are related to one another. A non-relational database is one that does not adopt the relational structure of traditional databases. In recent years, this class of databases has also been referred to as Not only SQL (NoSQL). This paper discusses comparisons of the execution time of the two types of database (SQL and NoSQL). In SQL (Structured Query Language) databases, inserting and updating data rely on techniques such as indexing, bulk insert and multiple updating. In NoSQL databases, the corresponding techniques are default indexing, batch insert, multiple updating and pipeline aggregation. The results show, first, that compared with related work the performance of both SQL and NoSQL databases can be improved. Second, insert and update operations can be made dramatically faster in the NoSQL database than in the SQL database. To demonstrate the performance of the different techniques for inserting and updating data, the paper uses data sets of different sizes and reports the corresponding results. The SQL experiments cover 50,000 to 3,000,000 records, while the NoSQL experiments cover 50,000 to 16,000,000 documents (2 GB). In SQL, three million records are inserted in 606.53 seconds, while in NoSQL the same number of documents is inserted in 67.87 seconds. For updates, 300,000 records are updated in 271.17 seconds in SQL, while in NoSQL the same number of documents is updated in just 46.02 seconds.