时间都去哪儿了?表征memcached中的尾部延迟

2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS) Pub Date : 2015-03-29 DOI:10.1109/ISPASS.2015.7095781

G. Blake, A. Saidi

{"title":"时间都去哪儿了?表征memcached中的尾部延迟","authors":"G. Blake, A. Saidi","doi":"10.1109/ISPASS.2015.7095781","DOIUrl":null,"url":null,"abstract":"To function correctly Online, Data-Intensive (OLDI) services require low and consistent service times. Maintaining predictable service times entails requiring 99th or higher percentile latency targets across hundreds to thousands of servers in the data-center. However, to maintain the 99th percentile targets servers are routinely run well below full utilization. The main difficulty in optimizing a server to run closer to peak utilization and maintain predictable 99th percentile response latencies is identifying and mitigating the causes of a request missing the target service time. In practice this analysis is challenging as requests and responses overlap their execution with respect to one another and traverse multiple layers of software, user/kernel protection boundaries, and the hardware/software divide. Traditional profiling methods that record the time being spent in each function usually yield few clues as to where a bottleneck may be present due to the many layers of software each consuming only a small fraction of time each. In this work we analyze the end-to-end sources of latency in a Memcached server from the wire through the kernel into the application and back again. To do so, we develop a tool that utilizes the Linux SystemTap infrastructure to measure latency throughout the many software layers that make up the complete request and response path for Memcached. While memory copies and the Linux networking stack are often suggested as major contributors to latency, we find that the main cause of missing response latency guarantees is the formation of standing queues and the application's inability to detect and remedy this situation.","PeriodicalId":189378,"journal":{"name":"2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)","volume":"72 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-03-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":"{\"title\":\"Where does the time go? characterizing tail latency in memcached\",\"authors\":\"G. Blake, A. Saidi\",\"doi\":\"10.1109/ISPASS.2015.7095781\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"To function correctly Online, Data-Intensive (OLDI) services require low and consistent service times. Maintaining predictable service times entails requiring 99th or higher percentile latency targets across hundreds to thousands of servers in the data-center. However, to maintain the 99th percentile targets servers are routinely run well below full utilization. The main difficulty in optimizing a server to run closer to peak utilization and maintain predictable 99th percentile response latencies is identifying and mitigating the causes of a request missing the target service time. In practice this analysis is challenging as requests and responses overlap their execution with respect to one another and traverse multiple layers of software, user/kernel protection boundaries, and the hardware/software divide. Traditional profiling methods that record the time being spent in each function usually yield few clues as to where a bottleneck may be present due to the many layers of software each consuming only a small fraction of time each. In this work we analyze the end-to-end sources of latency in a Memcached server from the wire through the kernel into the application and back again. To do so, we develop a tool that utilizes the Linux SystemTap infrastructure to measure latency throughout the many software layers that make up the complete request and response path for Memcached. While memory copies and the Linux networking stack are often suggested as major contributors to latency, we find that the main cause of missing response latency guarantees is the formation of standing queues and the application's inability to detect and remedy this situation.\",\"PeriodicalId\":189378,\"journal\":{\"name\":\"2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)\",\"volume\":\"72 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-03-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"10\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ISPASS.2015.7095781\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISPASS.2015.7095781","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 10

摘要

为了在线正常工作，数据密集型(OLDI)业务需要较低且一致的服务时间。维护可预测的服务时间需要在数据中心的数百到数千台服务器上设置99%或更高百分比的延迟目标。然而，为了维护第99个百分位数的目标，服务器通常在完全利用率之下运行。优化服务器以使其运行更接近峰值利用率并保持可预测的第99百分位响应延迟的主要困难是识别和减轻请求错过目标服务时间的原因。在实践中，这种分析是具有挑战性的，因为请求和响应的执行相互重叠，并且跨越多个软件层、用户/内核保护边界和硬件/软件分界线。传统的分析方法记录了在每个功能上花费的时间，由于软件的许多层每个只消耗一小部分时间，因此通常很少产生关于瓶颈可能存在的线索。在本文中，我们将分析Memcached服务器中从连接到内核到应用程序再返回的端到端延迟源。为此，我们开发了一个工具，该工具利用Linux SystemTap基础设施来测量构成Memcached完整请求和响应路径的许多软件层中的延迟。虽然通常认为内存副本和Linux网络堆栈是造成延迟的主要原因，但我们发现，缺少响应延迟保证的主要原因是排队的形成以及应用程序无法检测和纠正这种情况。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Where does the time go? characterizing tail latency in memcached

To function correctly Online, Data-Intensive (OLDI) services require low and consistent service times. Maintaining predictable service times entails requiring 99th or higher percentile latency targets across hundreds to thousands of servers in the data-center. However, to maintain the 99th percentile targets servers are routinely run well below full utilization. The main difficulty in optimizing a server to run closer to peak utilization and maintain predictable 99th percentile response latencies is identifying and mitigating the causes of a request missing the target service time. In practice this analysis is challenging as requests and responses overlap their execution with respect to one another and traverse multiple layers of software, user/kernel protection boundaries, and the hardware/software divide. Traditional profiling methods that record the time being spent in each function usually yield few clues as to where a bottleneck may be present due to the many layers of software each consuming only a small fraction of time each. In this work we analyze the end-to-end sources of latency in a Memcached server from the wire through the kernel into the application and back again. To do so, we develop a tool that utilizes the Linux SystemTap infrastructure to measure latency throughout the many software layers that make up the complete request and response path for Memcached. While memory copies and the Linux networking stack are often suggested as major contributors to latency, we find that the main cause of missing response latency guarantees is the formation of standing queues and the application's inability to detect and remedy this situation.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)

自引率

0.00%

发文量