DataSteward: Using Dedicated Compute Nodes for Scalable Data Management on Public Clouds

2013 12th IEEE International Conference on Trust, Security and Privacy in Computing and Communications. Pub Date: 2013-07-16. DOI: 10.1109/TrustCom.2013.129

R. Tudoran, Alexandru Costan, Gabriel Antoniu


Citations: 6

Abstract

A wide range of scientific applications, some generating data volumes exceeding petabytes, are currently being ported to clouds to take advantage of their inherent elasticity and scalability. One of the critical needs in dealing with this "data deluge" is efficient, scalable and reliable storage. However, the storage services offered by cloud providers suffer from high latencies, trading performance for availability. One alternative is to federate the local virtual disks on the compute nodes into a globally shared storage used for large intermediate or checkpoint data. This collocated storage supports high throughput, but it can be very intrusive and is subject to failures that can stop the host node and degrade application performance. To deal with these limitations, we propose DataSteward, a data management system that provides a higher degree of reliability while remaining non-intrusive through the use of dedicated compute nodes. DataSteward harnesses the storage space of a set of dedicated VMs, selected using a topology-aware clustering algorithm, and has a lifetime dependent on the deployment lifetime. To capitalize on this separation, we introduce a set of scientific data processing services on top of the storage layer that can overlap with the executing applications. We performed extensive experiments on hundreds of cores in the Azure cloud: compared to state-of-the-art node selection algorithms, we show up to 20% higher throughput, which improves the overall performance of a real-life scientific application by up to 45%.
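
The abstract does not spell out the topology-aware clustering algorithm, so the following is only a rough illustration of the general idea of choosing dedicated storage VMs from measured network proximity. It is not the authors' method: the greedy k-medoids-style heuristic, the function names `select_dedicated_nodes` and `assign_clients`, and the throughput matrix are all assumptions made for this sketch.

```python
from typing import Dict, List, Sequence

Matrix = Sequence[Sequence[float]]


def select_dedicated_nodes(throughput: Matrix, k: int) -> List[int]:
    """Greedily pick k nodes so that every remaining node has a fast link
    to at least one picked node (hypothetical k-medoids-style heuristic)."""
    n = len(throughput)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and the number of nodes")
    chosen: List[int] = []
    # best[i] is the highest throughput from node i to any chosen node so far.
    best = [0.0] * n
    for _ in range(k):
        best_score, best_node = -1.0, -1
        for c in range(n):
            if c in chosen:
                continue
            # Total link quality seen by the other nodes if c were added.
            score = sum(
                max(best[i], throughput[i][c])
                for i in range(n)
                if i != c and i not in chosen
            )
            if score > best_score:  # ties keep the lowest node index
                best_score, best_node = score, c
        chosen.append(best_node)
        for i in range(n):
            best[i] = max(best[i], throughput[i][best_node])
    return sorted(chosen)


def assign_clients(throughput: Matrix, dedicated: List[int]) -> Dict[int, int]:
    """Map each compute node to the dedicated node it reaches fastest."""
    return {
        i: max(dedicated, key=lambda d: throughput[i][d])
        for i in range(len(throughput))
        if i not in dedicated
    }


# Made-up measurements (MB/s): nodes 0-2 share one rack, nodes 3-5 another.
FAST, SLOW = 100.0, 10.0
matrix = [
    [0.0 if i == j else (FAST if i // 3 == j // 3 else SLOW) for j in range(6)]
    for i in range(6)
]
dedicated = select_dedicated_nodes(matrix, k=2)
print(dedicated)                          # [0, 3]
print(assign_clients(matrix, dedicated))  # {1: 0, 2: 0, 4: 3, 5: 3}
```

With two well-separated groups of VMs, the heuristic places one dedicated node in each group, so every compute node keeps a fast path to its storage node. A real deployment would have to measure the matrix at run time, since public clouds do not expose their physical topology.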
