Performance Comparison of Containerized HBase Clusters on Kubernetes
Ta-Chun Lo, Chun-Ying Tao, Jyh-Biau Chang, C. Shieh
2022 IEEE International Conference on Recent Advances in Systems Science and Engineering (RASSE), published 2022-11-07
DOI: https://doi.org/10.1109/RASSE54974.2022.9989814
Abstract
The demand for large-volume database storage has grown with the rise of big data. Because NoSQL databases generally outperform SQL databases when handling very large datasets, many developers adopt them as their first choice. Among NoSQL databases, HBase has become popular for its flexibility and high efficiency in big data processing. HBase is a column-oriented NoSQL database that stores its data on HDFS and integrates well with Hadoop ecosystem applications. However, deploying an HBase cluster on bare metal or virtual machines can be complicated and time-consuming. Container technology makes HBase installation more convenient, but a containerized HBase cluster can be deployed in different ways, and the choice of deployment approach affects performance. In this research, we propose two approaches to containerizing HBase on Kubernetes: the Container-dedicated approach and the Container-shared approach. We use two benchmark tools to compare their performance under different workloads. According to the experimental results, the Container-dedicated approach is suitable for write-heavy and read/write-balanced applications, while the Container-shared approach performs better in read-heavy applications. These results give future developers a reference when designing a containerized HBase cluster.