Marcos Paulo Rocha, F. França, A. S. Nery, Leandro S. Guedes
Int. J. Grid Util. Comput., published 2019-05-15. DOI: 10.1504/IJGUC.2019.099689 (https://doi.org/10.1504/IJGUC.2019.099689)
An optimised dataflow engine for GPGPU stream processing
Stream processing applications have demanding performance requirements that are hard to meet with traditional parallel models on modern many-core architectures such as GPUs. Recent dataflow computing models, by contrast, naturally expose parallelism and make it easier to exploit for a wide class of applications: instead of following program order, operations run in parallel as soon as their input operands become available. This work presents an extension to an existing dataflow library for Java. The extension implements high-level constructs with multiple command queues that allow memory operations to overlap with kernel executions on GPUs. Experimental results show that significant speedup can be achieved for a subset of well-known stream processing applications: Volume Ray-Casting, Path-Tracing and Sobel Filter. The work also presents new contributions on concurrency analysis and on the stream processing parallel model in dataflow.
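The two ideas in the abstract, firing an operation once its operands are available and using separate queues so that transfers and kernels overlap, can be illustrated with a small CPU-only sketch. This is not the API of the library described in the paper, which the abstract does not show. It uses only `java.util.concurrent`, and the names `transferQueue`, `computeQueue`, `copyToDevice` and `kernel` are hypothetical stand-ins: two single-threaded executors play the role of GPU command queues, and no GPU is involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class DataflowOverlapSketch {

    // Hypothetical stand-in for a host-to-device copy: it only clones the chunk.
    static int[] copyToDevice(int[] chunk) {
        return chunk.clone();
    }

    // Hypothetical stand-in for a GPU kernel: squares every element.
    static int[] kernel(int[] deviceChunk) {
        int[] out = new int[deviceChunk.length];
        for (int i = 0; i < out.length; i++) {
            out[i] = deviceChunk[i] * deviceChunk[i];
        }
        return out;
    }

    // Processes a stream of chunks. Each chunk is a two-node dataflow graph:
    // copy -> kernel. The kernel node fires only when its copy has completed,
    // and because copies and kernels run on different single-threaded queues,
    // the copy of chunk i+1 may proceed while the kernel of chunk i is running.
    public static List<int[]> process(List<int[]> chunks) {
        ExecutorService transferQueue = Executors.newSingleThreadExecutor();
        ExecutorService computeQueue = Executors.newSingleThreadExecutor();
        try {
            List<CompletableFuture<int[]>> pending = new ArrayList<>();
            for (int[] chunk : chunks) {
                pending.add(CompletableFuture
                        .supplyAsync(() -> copyToDevice(chunk), transferQueue)
                        .thenApplyAsync(DataflowOverlapSketch::kernel, computeQueue));
            }
            // Collect results in submission order.
            List<int[]> results = new ArrayList<>();
            for (CompletableFuture<int[]> f : pending) {
                results.add(f.join());
            }
            return results;
        } finally {
            transferQueue.shutdown();
            computeQueue.shutdown();
        }
    }

    public static void main(String[] args) {
        List<int[]> out = process(Arrays.asList(new int[]{1, 2}, new int[]{3, 4}));
        for (int[] chunk : out) {
            System.out.println(Arrays.toString(chunk));
        }
    }
}
```

With a single queue, each chunk's copy and kernel would be serialised behind the previous chunk's work. Splitting them across two queues is what lets the pipeline stages overlap, which is the effect the abstract attributes to its multiple command queues.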