Parallel cache-efficient code for computing the McCaskill partition functions

2019 Federated Conference on Computer Science and Information Systems (FedCSIS). Pub Date: 2019-09-26. DOI: 10.15439/2019F8

M. Pałkowski, W. Bielecki

Citations: 3

Abstract

We present parallel, tiled, optimized code for computing McCaskill's partition functions, a CPU- and memory-intensive dynamic programming task from computational biology. To optimize the code, we use the authors' own source-to-source compiler, TRACO, and compare the performance of the resulting code with that of code generated by the state-of-the-art PLuTo compiler, which is based on the affine transformation framework (ATF). Although PLuTo generates tiled code with outstanding locality, it fails to parallelize that tiled code. The TRACO tiling strategy uses the transitive closure of a dependence graph and so avoids computing affine functions. The ISL scheduler is used to parallelize the tiled loop nests. An experimental study carried out on a multi-core computer demonstrates considerable speed-up of the generated code for larger numbers of threads.
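
To make the kind of loop nest discussed in the abstract more concrete, the following is a minimal sketch of a partition-function dynamic program over an RNA sequence. It is not McCaskill's full algorithm (which uses a nearest-neighbour energy model and several tables) and it is not TRACO or PLuTo output. It assumes a single hypothetical energy per base pair, and the names `can_pair`, `partition_function`, `pair_energy`, `kT` and `min_loop` are illustrative only. What it does show is the triangular iteration space: each cell `Z[i][j]` reads cells from its own row and its own column, and this dependence pattern is what a tiling compiler has to respect.

```python
import math

# Watson-Crick and wobble pairs; a simplifying assumption for this sketch.
PAIRS = {frozenset("AU"), frozenset("GC"), frozenset("GU")}


def can_pair(a, b):
    """Return True if bases a and b may form a pair."""
    return frozenset((a, b)) in PAIRS


def partition_function(seq, pair_energy=-1.0, kT=1.0, min_loop=3):
    """Simplified partition function over non-crossing structures of seq.

    Z[i][j] covers the half-open subsequence seq[i:j]. Every base pair
    contributes the same Boltzmann weight exp(-pair_energy / kT); this
    is an assumption of the sketch, not the real energy model.
    """
    n = len(seq)
    w = math.exp(-pair_energy / kT)
    # Empty and single-base subsequences have exactly one structure.
    Z = [[1.0] * (n + 1) for _ in range(n + 1)]
    for length in range(1, n + 1):          # shorter intervals first
        for i in range(n - length + 1):
            j = i + length
            total = Z[i][j - 1]             # base j-1 left unpaired
            # Base j-1 paired with base k, leaving at least min_loop
            # unpaired bases strictly between them.
            for k in range(i, j - 1 - min_loop):
                if can_pair(seq[k], seq[j - 1]):
                    total += Z[i][k] * w * Z[k + 1][j - 1]
            Z[i][j] = total
    return Z[0][n]
```

For example, `partition_function("GCGC", pair_energy=0.0, min_loop=0)` counts the non-crossing structures of a four-base sequence and returns `7.0`. The innermost `k` loop makes the cost cubic in the sequence length, which is why locality and parallelism matter for this class of kernels.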
