Telescope: Telemetry at Terabyte Scale

Alan Nair, Sandeep Kumar, Aravinda Prasad, Andy Rudoff, Sreenivas Subramoney
Source: arXiv - CS - Operating Systems
Published: 2023-11-17
DOI: arxiv-2311.10275 (https://doi.org/arxiv-2311.10275)
Citations: 0

Abstract

Data-hungry applications that require terabytes of memory have become widespread in recent years. To meet the memory needs of these applications, data centers are embracing tiered memory architectures with near and far memory tiers. Precise, efficient, and timely identification of hot and cold data and their placement in appropriate tiers is critical for performance in such systems. Unfortunately, the existing state-of-the-art telemetry techniques for hot and cold data detection are ineffective at the terabyte scale. We propose Telescope, a novel technique that profiles different levels of the application's page table tree for fast and efficient identification of hot and cold data. Telescope is based on the observation that, for a memory- and TLB-intensive workload, higher levels of a page table tree are also frequently accessed during a hardware page table walk. Hence, the hotness of the higher levels of the page table tree essentially captures the hotness of its subtrees or address space sub-regions at a coarser granularity. We exploit this insight to quickly converge on even a few megabytes of hot data and efficiently identify several gigabytes of cold data in terabyte-scale applications. Importantly, such a technique can seamlessly scale to petabyte-scale applications. Telescope's telemetry achieves 90%+ precision and recall at just 0.009% single CPU utilization for microbenchmarks with a 5 TB memory footprint. Memory tiering based on Telescope results in 5.6% to 34% throughput improvement for real-world benchmarks with a 1-2 TB memory footprint compared to other state-of-the-art telemetry techniques.
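
The top-down idea in the abstract can be illustrated with a small user-space simulation. This is a minimal sketch, not the authors' implementation: it assumes an x86-64 style 4-level page table with 4 KiB pages and 9 index bits per level, and it simulates accessed bits from a list of touched addresses rather than reading real page tables. All function names (`entry_span`, `accessed_bits`, `descend`, `profile`) are hypothetical. The profiler descends only into entries whose accessed bit is set, so an untouched upper-level entry marks its whole subtree (up to 512 GiB at the top level) as cold in a single check.

```python
from collections import defaultdict

PAGE_SHIFT = 12       # assumed 4 KiB pages
BITS_PER_LEVEL = 9    # assumed 512 entries per table (x86-64 style)
LEVELS = 4            # level 3 = top-level entries, level 0 = leaf PTEs


def entry_span(level):
    """Bytes of virtual address space covered by one entry at `level`."""
    return 1 << (PAGE_SHIFT + BITS_PER_LEVEL * level)


def accessed_bits(addresses):
    """Simulate accessed bits: touching an address marks its leaf entry
    and every ancestor entry, as a page table walk would."""
    bits = defaultdict(set)  # level -> set of entry indices
    for addr in addresses:
        for level in range(LEVELS):
            bits[level].add(addr >> (PAGE_SHIFT + BITS_PER_LEVEL * level))
    return bits


def descend(bits, level, index, hot, cold, stats):
    """Check one entry; recurse into its children only if it was accessed."""
    stats["visited"] += 1
    if index not in bits[level]:
        # Untouched entry: the entire subtree is cold at this granularity.
        cold.append((index * entry_span(level), entry_span(level)))
        return
    if level == 0:
        hot.append(index * entry_span(0))  # base address of a hot page
        return
    first_child = index << BITS_PER_LEVEL
    for child in range(first_child, first_child + (1 << BITS_PER_LEVEL)):
        descend(bits, level - 1, child, hot, cold, stats)


def profile(addresses, n_top_entries):
    """Return (hot page addresses, cold (start, size) regions, entries visited)
    for an address space spanning `n_top_entries` top-level entries."""
    bits = accessed_bits(addresses)
    hot, cold, stats = [], [], {"visited": 0}
    for top in range(n_top_entries):
        descend(bits, LEVELS - 1, top, hot, cold, stats)
    return hot, cold, stats["visited"]


# Example: three touched pages inside a 1 TiB address space.
GIB = 1 << 30
hot, cold, visited = profile([0x1000, GIB + 0x2000, 512 * GIB + 5 * 4096], 2)
print(len(hot), "hot pages;", visited, "entries checked out of",
      (1 << 40) >> PAGE_SHIFT, "leaf pages")
```

In this toy setup the number of entries checked grows with the number of hot subtrees rather than with the total footprint, which is the property the abstract relies on when it describes converging on a few megabytes of hot data while classifying gigabytes of cold data at coarse granularity. A real profiler would also need to reset accessed bits between sampling windows and account for TLB caching, neither of which is modeled here.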