{"title":"Pipelined execution of data-parallel algorithms","authors":"M. Gorev, R. Ubar","doi":"10.1109/BEC.2014.7320568","journal":{"name":"2014 14th Biennial Baltic Electronic Conference (BEC)","volume":"40 1","pages":"0"},"publicationDate":"2014-10-01","publicationTypes":"Journal Article","isOpenAccess":false,"citationCount":"1","platform":"Semanticscholar","PeriodicalId":348260,"PeriodicalName":"2014 14th Biennial Baltic Electronic Conference (BEC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/BEC.2014.7320568"}
A combination of pipelining and data-parallel execution on multiprocessor systems is proposed. In coarse-grained data-parallel applications, pipelining can be more advantageous than the classical data-parallel approach. It is used to reduce redundant data transfers to the cores involved in processing. A class of simulation applications is taken as an example to illustrate the principles of the method. It is shown that the overall execution time can be reduced by a significant part of the time required to transfer the model data. A set of experiments was carried out using a desktop multicore processor and the OpenCL framework for parallel execution. The experimental results show that a speedup is achievable even on general-purpose MPSoC platforms.
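
The abstract gives no formulas, so the following is only a toy cost model sketched under assumptions that are not taken from the paper: model transfers to different cores are serialized, the model splits into equally sized and perfectly balanced pipeline stages, and the cost of moving input items between stages is ignored. The function names and parameters (`data_parallel_time`, `pipelined_time`, `model_transfer`, `item_compute`) are hypothetical. It illustrates why sending each core only a slice of the model, instead of a full copy, could shorten the total run time.

```python
def data_parallel_time(n_items, n_cores, model_transfer, item_compute):
    """Classical data-parallel run: every core gets a full copy of the model."""
    # Assumption: the copies are sent one after another (serialized transfers).
    transfer = n_cores * model_transfer
    # Each core processes its share of the input items (ceiling division).
    items_per_core = -(-n_items // n_cores)
    return transfer + items_per_core * item_compute


def pipelined_time(n_items, n_cores, model_transfer, item_compute):
    """Pipelined run: each core holds one slice of the model and acts as a stage."""
    # The model is transferred only once in total, one slice per core.
    transfer = model_transfer
    # Assumption: stages are perfectly balanced, so each takes an equal share.
    stage_time = item_compute / n_cores
    # Standard pipeline timing: fill the pipeline, then one item per stage time.
    return transfer + (n_items + n_cores - 1) * stage_time


if __name__ == "__main__":
    args = dict(n_items=1000, n_cores=4, model_transfer=50.0, item_compute=1.0)
    print("data-parallel:", data_parallel_time(**args))
    print("pipelined:    ", pipelined_time(**args))
```

In this toy setting the pipelined variant saves roughly `(n_cores - 1) * model_transfer` and pays only a small pipeline fill cost, which matches the abstract's claim in spirit. Real transfer overlap, stage imbalance, and inter-stage communication would change the numbers.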