Xiaodong Yi, Junjie Wang, Jingpu Duan, Wei Bai, Chuan Wu, Y. Xiong, Dongsu Han
Published in: 2019 IEEE 27th International Conference on Network Protocols (ICNP), 2019-10-08
DOI: 10.1109/ICNP.2019.8888129 (https://doi.org/10.1109/ICNP.2019.8888129)
FlowShader: a Generalized Framework for GPU-accelerated VNF Flow Processing
GPU acceleration has been widely investigated for packet processing in virtual network functions (NFs), but not for L7 flow-processing NFs. In L7 NFs, reassembled TCP messages of the same flow must be processed in order by the same processing thread, and the uneven sizes among flows make it hard to fully exploit the GPU's parallel computing power. To exploit GPUs for L7 NF processing, this paper presents FlowShader, a GPU acceleration framework that achieves both high generality and high throughput, even under skewed flow size distributions. We design an efficient scheduling algorithm that fully exploits the available GPU and CPU capacities; in particular, it dispatches the large flows that severely upset the size balance to the CPU and the remaining flows to the GPU. Furthermore, FlowShader allows NF logic similar to that of CPU-based NFs to run on individual GPU threads, which is more general and easier to adopt than redesigning an NF for operation-level parallelism on the GPU. We implemented a number of L7 flow-processing NFs on top of FlowShader. Evaluations under both synthetic and real-world traffic traces show that FlowShader achieves throughput up to 6x that of the CPU-only baseline and 3x that of the GPU-only design.
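The abstract does not spell out the scheduling algorithm, so the following is only a minimal sketch of the general idea of sending outsized flows to the CPU and the rest to the GPU. Everything in it is an assumption made for illustration: the names `split_flows` and `gpu_batch_utilization`, the median-based threshold, the `skew_factor` parameter, and the simple cost model in which each flow occupies one GPU thread and a batch finishes only when its largest flow does. It is not FlowShader's actual scheduler.

```python
from typing import Dict, List, Tuple


def split_flows(flow_sizes: Dict[str, int],
                skew_factor: float = 4.0) -> Tuple[List[str], List[str]]:
    """Partition flows into (cpu_flows, gpu_flows).

    Hypothetical heuristic: a flow is sent to the CPU when its pending
    bytes exceed skew_factor times the median pending size of the batch.
    """
    if not flow_sizes:
        return [], []
    sizes = sorted(flow_sizes.values())
    median = sizes[len(sizes) // 2]
    threshold = skew_factor * median
    cpu_flows: List[str] = []
    gpu_flows: List[str] = []
    for flow_id, size in flow_sizes.items():
        if size > threshold:
            cpu_flows.append(flow_id)   # outlier: would stall the GPU batch
        else:
            gpu_flows.append(flow_id)   # similar size: one GPU thread each
    return cpu_flows, gpu_flows


def gpu_batch_utilization(sizes: List[int]) -> float:
    """Fraction of thread-time spent on useful work in the assumed model.

    Each flow runs in order on its own thread, and the batch completes
    when the largest flow completes, so smaller flows leave threads idle.
    """
    if not sizes or max(sizes) == 0:
        return 0.0
    return sum(sizes) / (len(sizes) * max(sizes))


# Illustrative batch: four similar flows and one very large flow (bytes).
flows = {"a": 100, "b": 120, "c": 90, "d": 110, "e": 5000}
cpu_flows, gpu_flows = split_flows(flows)
before = gpu_batch_utilization(list(flows.values()))
after = gpu_batch_utilization([flows[f] for f in gpu_flows])
print(cpu_flows, gpu_flows, round(before, 4), round(after, 4))
```

In this toy batch, keeping the single large flow on the GPU leaves most threads idle while it finishes, whereas moving it to the CPU leaves the GPU with flows of comparable size. The threshold rule is deliberately simple; a real scheduler would also have to account for CPU capacity and per-flow ordering across batches.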