Reducing Tail Latencies Through Environment- and Neighbour-aware Thread Management
Andrew Jeffery, Chris Jensen, Richard Mortier
arXiv - CS - Performance, 2024-07-16
https://doi.org/arxiv-2407.11582
Abstract
Application tail latency is a key metric for many services, with high
latencies being linked directly to loss of revenue. Modern deeply nested
microservice architectures exacerbate tail latencies, increasing the
likelihood of users experiencing them. In this work, we show how CPU
overcommitment by OS threads leads to high tail latencies when applications are
under heavy load. CPU overcommitment can arise from two operational factors:
incorrectly determining the number of CPUs available when under a CPU quota,
and ignoring neighbour applications and their CPU usage. We discuss
how different languages obtain the number of CPUs available, evaluate the
impact, and consider opportunities for a more unified language-independent
interface to obtain the number of CPUs available. We then evaluate the impact
of neighbour usage on tail latency and introduce a new neighbour-aware
threadpool, the friendlypool, that dynamically avoids overcommitment. In our
evaluation, the friendlypool reduces maximum worker latency by up to
$6.7\times$ at the cost of decreasing throughput by up to $1.4\times$.
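
To make the two sources of overcommitment concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a Linux cgroup v2 environment, where the CPU quota is exposed in a `cpu.max` file as `$MAX $PERIOD`. The function names `cpus_from_quota` and `friendly_worker_limit`, the choice to round the quota up, and the neighbour-usage input are all hypothetical; the abstract does not say how the friendlypool measures neighbours or sizes itself.

```python
import math
import os


def cpus_from_quota(cpu_max_text: str, host_cpus: int) -> int:
    """Estimate usable CPUs from the contents of a cgroup v2 `cpu.max` file.

    The file holds "$MAX $PERIOD", where $MAX is either the literal "max"
    (no quota) or the runtime allowed per period, both in microseconds.
    Rounding the ratio up is an assumption of this sketch.
    """
    fields = cpu_max_text.split()
    if not fields or fields[0] == "max":
        return host_cpus  # no quota set: fall back to the host CPU count
    quota = int(fields[0])
    period = int(fields[1]) if len(fields) > 1 else 100_000
    return max(1, min(host_cpus, math.ceil(quota / period)))


def friendly_worker_limit(available_cpus: int, neighbour_busy_cpus: float) -> int:
    """Hypothetical neighbour-aware cap: leave room for CPUs neighbours use.

    `neighbour_busy_cpus` is an assumed input (for example, a recent average
    of CPUs consumed by other applications); always keep at least one worker.
    """
    return max(1, available_cpus - math.ceil(neighbour_busy_cpus))


def read_cpu_max(path: str = "/sys/fs/cgroup/cpu.max") -> str:
    """Read the quota file, treating a missing file as 'no quota'."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return "max 100000"


if __name__ == "__main__":
    host = os.cpu_count() or 1
    available = cpus_from_quota(read_cpu_max(), host)
    # Pretend neighbours currently keep 1.5 CPUs busy (illustrative value).
    print(available, friendly_worker_limit(available, 1.5))
```

A thread pool sized from the host CPU count alone would start 8 workers on an 8-CPU host even under a 2-CPU quota (`200000 100000`), which is the overcommitment the abstract describes; the sketch would instead size it at 2, and lower still when neighbours are busy.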