The Impact of Thread-Per-Core Architecture on Application Tail Latency

Pekka Enberg, Ashwin Rao, S. Tarkoma
Published in: 2019 ACM/IEEE Symposium on Architectures for Networking and Communications Systems (ANCS)
Publication date: 2019-09-01
DOI: 10.1109/ANCS.2019.8901874 (https://doi.org/10.1109/ANCS.2019.8901874)
Citations: 5

Abstract

The response time of an online service depends on the tail latency of a few of the applications it invokes in parallel to satisfy requests. Individual applications use one or more threads to fully utilize the available CPU cores, but this approach can incur serious overheads. The thread-per-core architecture has emerged to reduce these overheads, but it brings its own challenges in thread synchronization and OS interfaces. Applications can mitigate both issues with different techniques, but the impact of those techniques on application tail latency is an open question. We measure the impact of the thread-per-core architecture on application tail latency by implementing a key-value store that uses application-level partitioning and inter-thread messaging, and we compare its tail latency to that of Memcached, which uses a traditional key-value store design. Our experimental evaluation shows that our approach reduces tail latency by up to 71% compared to baseline Memcached running on commodity hardware and Linux. However, we observe that the thread-per-core approach is held back by request steering and OS interfaces, and that it could be further improved with NIC hardware offload.
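
To make the two techniques named in the abstract concrete, the following is a minimal sketch of application-level partitioning combined with inter-thread messaging. It is not the authors' implementation: the class `PartitionedStore`, its methods, and the use of CRC32 for key-to-partition mapping are all hypothetical choices made for illustration. Python threads are also not pinned to CPU cores, so the sketch only shows the ownership and messaging pattern, not the performance characteristics measured in the paper.

```python
import queue
import threading
import zlib


class PartitionedStore:
    """Toy key-value store (hypothetical): one worker thread per partition.

    Each worker owns a private dict, so no lock protects the data itself.
    Every request is sent as a message to the worker that owns the key.
    """

    def __init__(self, num_partitions):
        # One inbox (message queue) per partition.
        self.inboxes = [queue.Queue() for _ in range(num_partitions)]
        self.workers = [
            threading.Thread(target=self._run, args=(inbox,), daemon=True)
            for inbox in self.inboxes
        ]
        for worker in self.workers:
            worker.start()

    def owner_of(self, key):
        # Deterministic hash so a key always maps to the same partition.
        # CRC32 is an assumption of this sketch, not taken from the paper.
        return zlib.crc32(key.encode("utf-8")) % len(self.inboxes)

    @staticmethod
    def _run(inbox):
        shard = {}  # private to this worker; never touched by other threads
        while True:
            message = inbox.get()
            if message is None:  # shutdown signal
                break
            op, key, value, reply = message
            if op == "set":
                shard[key] = value
                reply.put(True)
            else:  # "get"
                reply.put(shard.get(key))

    def _send(self, op, key, value=None):
        # Steer the request to the owning worker and wait for its reply.
        reply = queue.Queue(maxsize=1)
        self.inboxes[self.owner_of(key)].put((op, key, value, reply))
        return reply.get()

    def set(self, key, value):
        return self._send("set", key, value)

    def get(self, key):
        return self._send("get", key)

    def close(self):
        # Ask every worker to stop, then wait for all of them to exit.
        for inbox in self.inboxes:
            inbox.put(None)
        for worker in self.workers:
            worker.join()
```

A caller would create the store with, for example, `PartitionedStore(4)`, call `set` and `get`, and finish with `close()`. The routing step in `_send` is a software stand-in for the request steering that the abstract identifies as a bottleneck; steering requests to the owning core before they reach application code is the kind of work the abstract suggests NIC hardware offload could take over.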