Implementation of an efficient RDMA mechanism tightly coupled with a TCP/IP offload engine

Hankook Jang, Sang-Hwa Chung, Dae-Hyun Yoo
Published in: 2008 International Symposium on Industrial Embedded Systems
Publication date: 2008-06-11
DOI: 10.1109/SIES.2008.4577684 (https://doi.org/10.1109/SIES.2008.4577684)
Citations: 6

Abstract

We develop a hybrid TCP/IP offload engine (hybrid TOE) that processes TCP/IP via hardware/software coprocessing based on an FPGA and a general-purpose embedded processor. We also develop an efficient remote direct memory access (RDMA) mechanism that is tightly coupled with the hybrid TOE. In this mechanism, the hybrid TOE performs CRC calculations using hardware modules and supports zero-copy data transmission; the host CPU simply generates and processes RDMA protocol headers. By using the hybrid TOE and the RDMA mechanism, computer systems can achieve good network performance with very low CPU utilization, and thus they can be expected to show a great improvement in overall performance. In experiments on a gigabit Ethernet network, although the embedded processor operated with a 300 MHz core clock, which was one-seventh the speed of the host CPU's clock, the hybrid TOE showed a minimum latency of 17.4 μs and a maximum bandwidth of 736 Mbps. The RDMA mechanism exhibited a minimum latency of 20.6 μs and a maximum bandwidth of 642 Mbps. Most importantly, the hybrid TOE and the TOE-based RDMA mechanism gave CPU utilizations of less than 5.6% and 8.4%, respectively, approximately one-tenth of the utilizations measured when TCP/IP and TCP/IP-based RDMA were processed by the host CPU.
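The figures quoted in the abstract imply a few derived quantities that are not stated outright. The following is a minimal sketch of that arithmetic, not anything taken from the paper: the constant and function names are hypothetical, and the 1000 Mbps line rate is an assumption based on the nominal rate of gigabit Ethernet.

```python
# Back-of-the-envelope arithmetic on the numbers quoted in the abstract.
# All names here are illustrative; only the numeric inputs come from the text.

EMBEDDED_CLOCK_MHZ = 300.0   # embedded processor core clock
CLOCK_RATIO = 7              # embedded clock is "one-seventh" of the host clock
LINK_MBPS = 1000.0           # ASSUMPTION: nominal gigabit Ethernet line rate

TOE_LATENCY_US = 17.4        # hybrid TOE minimum latency
TOE_BANDWIDTH_MBPS = 736.0   # hybrid TOE maximum bandwidth
RDMA_LATENCY_US = 20.6       # RDMA mechanism minimum latency
RDMA_BANDWIDTH_MBPS = 642.0  # RDMA mechanism maximum bandwidth


def derived_figures():
    """Return quantities implied by, but not stated in, the abstract."""
    return {
        # Host clock implied by the one-seventh ratio.
        "host_clock_mhz": EMBEDDED_CLOCK_MHZ * CLOCK_RATIO,
        # Fraction of the nominal line rate reached by each mechanism.
        "toe_link_share": TOE_BANDWIDTH_MBPS / LINK_MBPS,
        "rdma_link_share": RDMA_BANDWIDTH_MBPS / LINK_MBPS,
        # Extra minimum latency of the RDMA layer on top of the TOE.
        "rdma_added_latency_us": RDMA_LATENCY_US - TOE_LATENCY_US,
        # RDMA bandwidth relative to the raw TOE bandwidth.
        "rdma_bandwidth_vs_toe": RDMA_BANDWIDTH_MBPS / TOE_BANDWIDTH_MBPS,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions, the host CPU clock works out to about 2.1 GHz, the hybrid TOE reaches roughly 74% of the nominal line rate, and the RDMA layer adds about 3.2 μs of minimum latency while retaining about 87% of the TOE's peak bandwidth.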