Shin'ichiro Takizawa, Motohiko Matsuda, N. Maruyama, Y. Nakamura
{"title":"面向数据并行工作流的可伸缩多粒度数据模型","authors":"Shin'ichiro Takizawa, Motohiko Matsuda, N. Maruyama, Y. Nakamura","doi":"10.1145/3149457.3154483","DOIUrl":null,"url":null,"abstract":"Scientific applications consist of many tasks and each task has different requirements for the degree of parallelism and data access pattern. To satisfy these requirements, a task scheduling has to assign required number of processes to each task and task's input has to be decomposed and arranged to these processes by considering data access pattern to exploit data locality. However, hand-writing these code is a troublesome and error-prone work. We propose a multi-view data model where users can specify rules of data decomposition for multi-dimensional data to change data layout on top of processes and define unit of parallel processing by simple directives. Our framework conducts data arrangement and affinity-aware task scheduling transparently from users by following the specified rules. Through a case study of a lattice QCD simulation program, we confirmed that our proposal reduced programming efforts against hand-written MPI code with performance penalties up to 17%.","PeriodicalId":314778,"journal":{"name":"Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region","volume":"93 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-01-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"A Scalable Multi-Granular Data Model for Data Parallel Workflows\",\"authors\":\"Shin'ichiro Takizawa, Motohiko Matsuda, N. Maruyama, Y. Nakamura\",\"doi\":\"10.1145/3149457.3154483\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Scientific applications consist of many tasks and each task has different requirements for the degree of parallelism and data access pattern. To satisfy these requirements, a task scheduling has to assign required number of processes to each task and task's input has to be decomposed and arranged to these processes by considering data access pattern to exploit data locality. However, hand-writing these code is a troublesome and error-prone work. We propose a multi-view data model where users can specify rules of data decomposition for multi-dimensional data to change data layout on top of processes and define unit of parallel processing by simple directives. Our framework conducts data arrangement and affinity-aware task scheduling transparently from users by following the specified rules. Through a case study of a lattice QCD simulation program, we confirmed that our proposal reduced programming efforts against hand-written MPI code with performance penalties up to 17%.\",\"PeriodicalId\":314778,\"journal\":{\"name\":\"Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region\",\"volume\":\"93 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-01-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3149457.3154483\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the International Conference on High Performance Computing in Asia-Pacific Region","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3149457.3154483","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
A Scalable Multi-Granular Data Model for Data Parallel Workflows
Scientific applications consist of many tasks and each task has different requirements for the degree of parallelism and data access pattern. To satisfy these requirements, a task scheduling has to assign required number of processes to each task and task's input has to be decomposed and arranged to these processes by considering data access pattern to exploit data locality. However, hand-writing these code is a troublesome and error-prone work. We propose a multi-view data model where users can specify rules of data decomposition for multi-dimensional data to change data layout on top of processes and define unit of parallel processing by simple directives. Our framework conducts data arrangement and affinity-aware task scheduling transparently from users by following the specified rules. Through a case study of a lattice QCD simulation program, we confirmed that our proposal reduced programming efforts against hand-written MPI code with performance penalties up to 17%.