{"title":"Tessellating Stencils","authors":"Liang Yuan, Yunquan Zhang, Peng Guo, Shan Huang","doi":"10.1145/3126908.3126920","DOIUrl":null,"url":null,"abstract":"Stencil computations represent a very common class of nested loops in scientific and engineering applications. The exhaustively studied tiling is one of the most powerful transformation techniques to explore the data locality and parallelism. Unlike previous work, which mostly blocks the iteration space of a stencil directly, this paper proposes a novel two-level tessellation scheme. A set of blocks are designed to tessellate the spatial space in various ways. The blocks can be processed in parallel without redundant computation. This corresponds to extending them along the time dimension and can form a tessellation of the iteration space. Experimental results show that our code performs up to 12% better than the existing highly concurrent schemes for the 3d27p stencil.","PeriodicalId":204241,"journal":{"name":"SC17: International Conference for High Performance Computing, Networking, Storage and Analysis","volume":"18 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"14","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"SC17: International Conference for High Performance Computing, Networking, Storage and Analysis","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3126908.3126920","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Stencil computations are a very common class of nested loops in scientific and engineering applications. Tiling, which has been studied exhaustively, is one of the most powerful transformation techniques for exploiting data locality and parallelism. Unlike previous work, which mostly blocks the iteration space of a stencil directly, this paper proposes a novel two-level tessellation scheme. A set of blocks is designed to tessellate the spatial domain in various ways, and the blocks can be processed in parallel without redundant computation. Extending these blocks along the time dimension yields a tessellation of the iteration space. Experimental results show that our code performs up to 12% better than existing highly concurrent schemes for the 3d27p stencil.
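To make the idea of tessellating space and then extending the blocks through time more concrete, the following is a minimal one-dimensional sketch. It is not the paper's implementation and does not reproduce its 3d27p kernel. It assumes a 3-point averaging stencil with periodic boundaries, and the names `reference`, `tessellated`, `width`, and `tb` are hypothetical. Each time tile of `tb` steps is covered in two stages: shrinking triangles inside each spatial block, then growing triangles centered on the block boundaries. Within a stage the blocks touch disjoint data, so they could in principle be run in parallel, and every point is updated exactly once per time step.

```python
def reference(u, steps):
    """Naive 3-point averaging stencil with periodic boundaries."""
    n = len(u)
    cur = list(u)
    for _ in range(steps):
        cur = [(cur[(i - 1) % n] + cur[i] + cur[(i + 1) % n]) / 3.0
               for i in range(n)]
    return cur


def tessellated(u, steps, width, tb):
    """Same stencil, computed in time tiles of tb steps using two stages.

    Assumptions of this sketch: len(u) is a multiple of width,
    width >= 2 * tb, and steps is a multiple of tb.
    """
    n = len(u)
    assert n % width == 0 and width >= 2 * tb and steps % tb == 0
    # buf[t % 2][i] holds the value of point i at time step t.
    buf = [list(u), list(u)]

    def update(t, i):
        # Advance point i from time step t - 1 to time step t.
        src, dst = buf[(t - 1) % 2], buf[t % 2]
        i %= n
        dst[i] = (src[(i - 1) % n] + src[i] + src[(i + 1) % n]) / 3.0

    for t0 in range(0, steps, tb):
        # Stage 1: inside each block, the updated range shrinks by one
        # point on each side per time step. Blocks are independent.
        for s in range(0, n, width):
            for t in range(1, tb + 1):
                for i in range(s + t, s + width - t):
                    update(t0 + t, i)
        # Stage 2: around each block boundary, the updated range grows
        # by one point on each side per time step, filling the gaps
        # left by stage 1. These regions are independent of each other.
        for s in range(0, n, width):
            for t in range(1, tb + 1):
                for i in range(s - t, s + t):
                    update(t0 + t, i)
    return buf[steps % 2]
```

Calling `tessellated(u, 8, 8, 4)` on a list of 24 floats should give the same values as `reference(u, 8)`. The sketch keeps the loops sequential for clarity; the point of the two-stage ordering is that each inner `for s` loop has no cross-block dependences and no point is computed twice.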