{"title":"一种高效的域可编程并行LDPCC迭代译码结构","authors":"G. Al-Rawi, J. Cioffi","doi":"10.1109/ITCC.2001.918858","DOIUrl":null,"url":null,"abstract":"We present a domain-programmable (code-independent) parallel architecture for efficiently implementing iterative probabilistic decoding of LDPC codes. The architecture is based on distributed computing and message passing. The exploited parallelism was found to be communication limited. To increase the utilization of the computational resources, we separate the routing process and state management functionalities performed by physical nodes from computation functionalities performed by function units that can be shared by multiple physical nodes. Simulation results show that the proposed architecture leads to improvements in FU utilization by 251%, 116%, and 209% compared to a hypothetical fully parallel custom implementation, a fully sequential implementation, and a proprietary FPGA custom implementation, respectively, that all use the same core FU design. Compared to an implementation on a shared-memory general-purpose parallel machine, the proposed architecture exhibits 75.6% improvement in scalability. We also introduce a novel low cost store-and-forward routing algorithm for deadlock avoidance in torus networks.","PeriodicalId":318295,"journal":{"name":"Proceedings International Conference on Information Technology: Coding and Computing","volume":"34 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2001-04-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":"{\"title\":\"A highly efficient domain-programmable parallel architecture for iterative LDPCC decoding\",\"authors\":\"G. Al-Rawi, J. Cioffi\",\"doi\":\"10.1109/ITCC.2001.918858\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We present a domain-programmable (code-independent) parallel architecture for efficiently implementing iterative probabilistic decoding of LDPC codes. The architecture is based on distributed computing and message passing. The exploited parallelism was found to be communication limited. To increase the utilization of the computational resources, we separate the routing process and state management functionalities performed by physical nodes from computation functionalities performed by function units that can be shared by multiple physical nodes. Simulation results show that the proposed architecture leads to improvements in FU utilization by 251%, 116%, and 209% compared to a hypothetical fully parallel custom implementation, a fully sequential implementation, and a proprietary FPGA custom implementation, respectively, that all use the same core FU design. Compared to an implementation on a shared-memory general-purpose parallel machine, the proposed architecture exhibits 75.6% improvement in scalability. We also introduce a novel low cost store-and-forward routing algorithm for deadlock avoidance in torus networks.\",\"PeriodicalId\":318295,\"journal\":{\"name\":\"Proceedings International Conference on Information Technology: Coding and Computing\",\"volume\":\"34 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2001-04-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"14\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings International Conference on Information Technology: Coding and Computing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ITCC.2001.918858\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings International Conference on Information Technology: Coding and Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ITCC.2001.918858","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A highly efficient domain-programmable parallel architecture for iterative LDPCC decoding
We present a domain-programmable (code-independent) parallel architecture for efficiently implementing iterative probabilistic decoding of LDPC codes. The architecture is based on distributed computing and message passing. The exploited parallelism was found to be communication limited. To increase the utilization of the computational resources, we separate the routing process and state management functionalities performed by physical nodes from computation functionalities performed by function units that can be shared by multiple physical nodes. Simulation results show that the proposed architecture leads to improvements in FU utilization by 251%, 116%, and 209% compared to a hypothetical fully parallel custom implementation, a fully sequential implementation, and a proprietary FPGA custom implementation, respectively, that all use the same core FU design. Compared to an implementation on a shared-memory general-purpose parallel machine, the proposed architecture exhibits 75.6% improvement in scalability. We also introduce a novel low cost store-and-forward routing algorithm for deadlock avoidance in torus networks.