Parallelized Network Coding with SIMD Instruction Sets
Han Li, Huan-yan Qian
2008 International Symposium on Computer Science and Computational Technology, published 2008-12-20
DOI: 10.1109/ISCSCT.2008.141 (https://doi.org/10.1109/ISCSCT.2008.141)
It is a well-known result that network coding may achieve better network throughput in certain multicast topologies. However, the practicality of network coding has been questioned because of its high computational complexity. This paper presents an attempt at a high-performance implementation of network coding. We first propose to implement progressive decoding with Gauss-Jordan elimination, so that blocks can be decoded as they are received. We then employ hardware acceleration with SIMD vector instructions. We also use a careful threading design to take advantage of symmetric multiprocessor (SMP) systems and multicore processors. The core idea of our optimization is table-based multiplication in GF(2^8), which performs a row multiplication of random linear codes by looking up previously built product tables with vector operations, using the SSSE3 instruction PSHUFB. Our high-performance implementation is encapsulated as a C++ class library. On a PC with a dual-core 1.66 GHz Intel T5500 processor, the encoding bandwidth of our implementation reaches 42.493 MB/s with 128 blocks of 4 KB each.
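The table-based multiplication described in the abstract can be illustrated with a minimal sketch. This is not the authors' library: the names gf_mul, NibbleTables, make_tables, mul_add_row and encode are hypothetical, and the reduction polynomial 0x11B is an assumption, since the abstract does not name one. The sketch splits each source byte into two 4-bit halves and looks each half up in a 16-entry product table built for one coefficient. The code below is portable scalar C++; PSHUFB performs 16 lookups of this kind into a 16-byte table with a single instruction, which is presumably where the vectorized speedup comes from.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumption: reduction polynomial x^8 + x^4 + x^3 + x + 1 (0x11B).
// The abstract does not say which polynomial the authors use.
constexpr unsigned kPoly = 0x11B;

// Reference shift-and-add multiply in GF(2^8); used only to build tables.
std::uint8_t gf_mul(std::uint8_t a, std::uint8_t b) {
    unsigned acc = 0, x = a;
    for (int i = 0; i < 8; ++i) {
        if (b & (1u << i)) acc ^= x;   // add a * x^i when bit i of b is set
        x <<= 1;
        if (x & 0x100) x ^= kPoly;     // reduce back to 8 bits
    }
    return static_cast<std::uint8_t>(acc);
}

// Two 16-entry product tables for one coefficient c (hypothetical layout).
struct NibbleTables {
    std::array<std::uint8_t, 16> lo;  // lo[n] = c * n
    std::array<std::uint8_t, 16> hi;  // hi[n] = c * (n << 4)
};

NibbleTables make_tables(std::uint8_t c) {
    NibbleTables t{};
    for (unsigned n = 0; n < 16; ++n) {
        t.lo[n] = gf_mul(c, static_cast<std::uint8_t>(n));
        t.hi[n] = gf_mul(c, static_cast<std::uint8_t>(n << 4));
    }
    return t;
}

// dst[i] ^= c * src[i] for a whole row. Multiplication distributes over
// XOR, so c * s = lo[s & 0x0F] ^ hi[s >> 4].
void mul_add_row(const NibbleTables& t, const std::uint8_t* src,
                 std::uint8_t* dst, std::size_t len) {
    for (std::size_t i = 0; i < len; ++i) {
        dst[i] = static_cast<std::uint8_t>(
            dst[i] ^ t.lo[src[i] & 0x0F] ^ t.hi[src[i] >> 4]);
    }
}

// One coded block: out = sum over i of coeffs[i] * blocks[i] in GF(2^8).
std::vector<std::uint8_t> encode(
    const std::vector<std::uint8_t>& coeffs,
    const std::vector<std::vector<std::uint8_t>>& blocks) {
    assert(!blocks.empty() && coeffs.size() == blocks.size());
    std::vector<std::uint8_t> out(blocks[0].size(), 0);
    for (std::size_t i = 0; i < blocks.size(); ++i) {
        assert(blocks[i].size() == out.size());
        mul_add_row(make_tables(coeffs[i]), blocks[i].data(),
                    out.data(), out.size());
    }
    return out;
}
```

Calling encode with a vector of random coefficients and the source blocks yields one coded block. A vectorized variant would keep lo and hi in two 128-bit registers and replace the inner loop body with two byte shuffles and two XORs per 16 source bytes, masking out the high nibble before the first shuffle.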