On the Performance Analysis of Map-Reduce Programming Model on In-Memory NoSQL Storage Platforms: A Case Study

Secil Yuzuk, Murat G. Aktaş, M. Aktaş
Published in: 2018 International Congress on Big Data, Deep Learning and Fighting Cyber Terrorism (IBIGDELFT)
Publication date: 2018-12-01
DOI: 10.1109/IBIGDELFT.2018.8625300 (https://doi.org/10.1109/IBIGDELFT.2018.8625300)
Citations: 6

Abstract

Financial data analysis, which serves both as a road map for the future and as a mirror of the present, is vitally important to many institutions, so statistical analysis of financial data is common. Data size matters in such analysis: as the size and variety of the data increase, one can obtain more accurate analysis outcomes. However, larger data also brings disadvantages, such as performance loss when processing large-scale data. This loss affects both query performance and the various functions used in data analysis. It is therefore necessary to compare data storage platforms with respect to the performance of the query and statistical functions used in financial data analysis on large-scale financial data sets. To this end, the first step of this study compares query performance on relational and NoSQL-based storage environments, and then compares query performance on single-node and double-node in-memory NoSQL data storage environments. For testing, MSSQL was selected as the SQL database system and Hazelcast as the distributed in-memory NoSQL database system. The run times of the query and statistical functions were measured on these platforms for different data sizes. The map-reduce programming model was used to examine the ability of in-memory NoSQL data storage platforms to manage and manipulate data. Performance tests on single and multiple nodes show that in-memory NoSQL platforms perform very well compared with relational database systems. In addition, in-memory NoSQL storage platforms were found to provide higher performance gains when the map-reduce programming model is used.
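The abstract does not say which statistical functions were measured or how they were written for Hazelcast. As a minimal sketch of the map-reduce idea only, the following plain-Python example computes a mean and a population standard deviation from per-partition summaries. The function names, the round-robin partitioning, and the sample values are assumptions made for illustration; nothing here uses or reproduces the Hazelcast API or the authors' code.

```python
from functools import reduce
from math import sqrt


def partition(values, nodes):
    """Split values round-robin across a hypothetical number of nodes."""
    return [values[i::nodes] for i in range(nodes)]


def map_partial(chunk):
    """Map step: summarize one partition as (count, sum, sum of squares)."""
    return (len(chunk), sum(chunk), sum(x * x for x in chunk))


def combine(a, b):
    """Reduce step: merge two partial summaries element-wise."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def mean_and_std(values, nodes=2):
    """Compute mean and population standard deviation via map then reduce."""
    partials = [map_partial(chunk) for chunk in partition(values, nodes)]
    n, total, total_sq = reduce(combine, partials, (0, 0.0, 0.0))
    if n == 0:
        raise ValueError("no data")
    mean = total / n
    # Clamp at zero to guard against tiny negative values from rounding.
    variance = max(total_sq / n - mean * mean, 0.0)
    return mean, sqrt(variance)


# Illustrative values only; not data from the study.
prices = [10.0, 12.0, 11.0, 13.0, 14.0]
single = mean_and_std(prices, nodes=1)  # single-node layout
double = mean_and_std(prices, nodes=2)  # double-node layout
print(single, double)
```

Because each partition is reduced to a small summary before merging, the single-node and double-node layouts give the same statistics, and only the summaries would need to cross node boundaries.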