Authors: Yuan Li, Minxuan Zhang
Published in: 2018 10th International Conference on Communication Software and Networks (ICCSN), 2018-07-01
DOI: https://doi.org/10.1109/iccsn.2018.8488283
A Software/Hardware Parallel Uniform Random Number Generation Framework
In this paper, a software/hardware framework is proposed for generating uniform random numbers in parallel. Using the Fast Jump Ahead technique, the software produces an initial state for each generator, which guarantees that the sub-streams are independent. With this support from the software, the hardware structure can be constructed by simply replicating a single generator. We apply the framework to parallelize the MT19937 algorithm. Experimental results show that our framework can generate an arbitrary number of independent parallel random sequences while obtaining a speedup roughly proportional to the number of parallel cores. Our framework also outperforms the existing architectures reported in the literature in both throughput and scalability. Furthermore, we implement 149 parallel instances of the MT19937 generator on a Xilinx Virtex-5 FPGA device, achieving a throughput of 42.61M samples/s. Compared to CPU and GPU implementations, the throughput is 10.0 and 2.5 times higher, while the throughput-power efficiency is 167.3 and 18.1 times higher, respectively.
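The idea of seeding replicated generators by jumping ahead can be illustrated with a minimal sketch. It is not the authors' implementation. As a stand-in for MT19937 it uses the much smaller xorshift32 generator, which is also linear over GF(2), and it jumps ahead with a plain matrix power rather than the polynomial-based Fast Jump Ahead technique named in the abstract. All function names, and the `stride` parameter for the distance between sub-streams, are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def xorshift32_step(s):
    """Advance a xorshift32 state by one step (a linear map over GF(2))."""
    s ^= (s << 13) & MASK32
    s ^= s >> 17
    s ^= (s << 5) & MASK32
    return s

def transition_matrix():
    """Return the 32x32 GF(2) transition matrix as a list of column bitmasks.

    Column i is the image of the state that has only bit i set.
    """
    return [xorshift32_step(1 << i) for i in range(32)]

def mat_vec(m, v):
    """Multiply matrix m (column bitmasks) by bit-vector v over GF(2)."""
    out = 0
    i = 0
    while v:
        if v & 1:
            out ^= m[i]
        v >>= 1
        i += 1
    return out

def mat_mul(a, b):
    """Return the product a*b: each column of b mapped through a."""
    return [mat_vec(a, col) for col in b]

def mat_pow(m, e):
    """Raise m to the power e by repeated squaring."""
    result = [1 << i for i in range(32)]  # identity matrix
    while e:
        if e & 1:
            result = mat_mul(m, result)
        m = mat_mul(m, m)
        e >>= 1
    return result

def substream_seeds(seed, n_streams, stride):
    """Return initial states spaced `stride` steps apart on one sequence.

    Each replicated generator starts from one of these states, so the
    sub-streams do not overlap as long as each draws fewer than `stride`
    samples.
    """
    jump = mat_pow(transition_matrix(), stride)
    seeds = [seed]
    for _ in range(n_streams - 1):
        seeds.append(mat_vec(jump, seeds[-1]))
    return seeds
```

For example, `substream_seeds(2463534242, 4, 2**20)` returns four starting states, each 2**20 steps further along the same sequence. The software side would compute these once, and each hardware instance would then run the unmodified single-generator step from its own starting state. For a generator with a 19937-bit state, a dense matrix power of this kind becomes expensive, which is the motivation for polynomial-based jump-ahead methods.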