Redefining Data Locality for Cross-Data Center Storage

Kwangsung Oh, A. Raghavan, A. Chandra, J. Weissman
Proceedings of the 2nd International Workshop on Software-Defined Ecosystems, published 2015-06-16
DOI: 10.1145/2756594.2756596 (https://doi.org/10.1145/2756594.2756596)
Citations: 9

Abstract

Many cloud applications exploit the diversity of storage options in a data center to achieve desired cost, performance, and durability tradeoffs. It is common to see applications using a combination of memory, local disk, and archival storage tiers within a single data center to meet their needs. On Amazon, for example, hot data can be kept in memory using ElastiCache, and colder data in cheaper, slower storage such as S3. For user-facing applications, a recent trend is to place data in multiple data centers so that users can access their data with lower latency. The conventional wisdom is that co-location of computation and storage within the same data center is key to application performance, so applications running within a data center are often still limited to accessing local data. In this paper, using experiments on Amazon, Microsoft, and Google clouds, we show that this assumption is false, and that accessing data in nearby data centers may be faster than local access at different or even the same points in the storage hierarchy. This can lead not only to better performance, but also to reduced cost and simpler consistency policies, and it calls for reconsidering data locality in multi-data-center environments. These results argue for an expansion of cloud storage tiers to consider non-local storage options, and have interesting implications for the design of distributed storage systems.
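The idea of treating non-local storage as additional tiers can be illustrated with a minimal sketch. It is not the authors' system: the StorageOption type, the pick_option helper, and every latency and cost figure below are hypothetical placeholders chosen only to show the selection logic, not measurements from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageOption:
    data_center: str    # hypothetical label: "local" or "nearby"
    tier: str           # e.g. "memory", "disk", "object"
    latency_ms: float   # placeholder value, NOT a measurement from the paper
    cost_per_gb: float  # placeholder value, NOT a real price


def pick_option(options, max_cost_per_gb):
    """Return the lowest-latency option within the cost budget, or None."""
    affordable = [o for o in options if o.cost_per_gb <= max_cost_per_gb]
    return min(affordable, key=lambda o: o.latency_ms, default=None)


# Made-up numbers: the candidate set mixes local and nearby data centers,
# so a remote tier can win when it is faster than the affordable local one.
OPTIONS = [
    StorageOption("local", "memory", 1.0, 0.50),
    StorageOption("local", "object", 40.0, 0.03),
    StorageOption("nearby", "memory", 15.0, 0.50),
    StorageOption("nearby", "disk", 25.0, 0.10),
]

best = pick_option(OPTIONS, max_cost_per_gb=0.10)
print(best)
```

With these placeholder values and a budget of 0.10 per GB, the nearby disk tier is selected over the local object store, because the candidate set is not restricted to the local data center.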