Reika Kinoshita, S. Imamura, Lukas Vogel, Satoshi Kazama, Eiji Yoshida
2021 Ninth International Symposium on Computing and Networking (CANDAR), published 2021-11-01
DOI: https://doi.org/10.1109/CANDAR53791.2021.00024
Citations: 1
Abstract
Cost-Performance Evaluation of Heterogeneous Tierless Storage Management in a Public Cloud
Data analytics, which extracts valuable information from large amounts of data, plays an important role in a company's business decision-making. Data analytical processing is generally I/O-intensive because data must be retrieved from storage devices. Public cloud services have recently become a popular choice for data analytical processing because a wide variety of storage volumes is immediately available without preparing physical hardware. In such a public cloud, multiple types of storage volumes must be combined appropriately to obtain high I/O throughput at low cost. In this paper, using Amazon Web Services (AWS), we quantitatively evaluate the advantages of a state-of-the-art heterogeneous tierless storage management (HTSM) technique, which is designed for relational databases, over a traditional storage caching mechanism. Our evaluation with all types of Elastic Block Store (EBS) volumes and the TPC-H and TPC-DS benchmarks shows that the HTSM technique outperforms Linux bcache by up to 3.35 times within specified cost constraints. Moreover, we demonstrate that it also mitigates the AWS-specific throughput degradations of storage volumes.