Andra Hugo, A. Guermouche, Pierre-André Wacrenier, R. Namyst
{"title":"A Runtime Approach to Dynamic Resource Allocation for Sparse Direct Solvers","authors":"Andra Hugo, A. Guermouche, Pierre-André Wacrenier, R. Namyst","doi":"10.1109/ICPP.2014.57","DOIUrl":null,"url":null,"abstract":"To face the advent of multicore processors and the ever increasing complexity of hardware architectures, programming models based on DAG-of-tasks parallelism regained popularity in the high performance, scientific computing community. In this context, enabling HPC applications to perform efficiently when dealing with graphs of parallel tasks that could potentially run simultaneously is a great challenge. Even if a uniform runtime system is used underneath, scheduling multiple parallel tasks over the same set of hardware resources introduces many issues, such as undesirable cache flushes or memory bus contention. In this paper, we show how runtime system-based scheduling contexts can be used to dynamically enforce locality of parallel tasks on multicore machines. We extend an existing generic sparse direct solver to use our mechanism and introduce a new decomposition method based on proportional mapping that is used to build the scheduling contexts. We propose a runtime-level dynamic context management policy to cope with the very irregular behaviour of the application. A detailed performance analysis shows significant performance improvements of the solver over various multicore hardware.","PeriodicalId":441115,"journal":{"name":"2014 43rd International Conference on Parallel Processing","volume":"66 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 43rd International Conference on Parallel Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICPP.2014.57","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 8
Abstract
To face the advent of multicore processors and the ever increasing complexity of hardware architectures, programming models based on DAG-of-tasks parallelism regained popularity in the high performance, scientific computing community. In this context, enabling HPC applications to perform efficiently when dealing with graphs of parallel tasks that could potentially run simultaneously is a great challenge. Even if a uniform runtime system is used underneath, scheduling multiple parallel tasks over the same set of hardware resources introduces many issues, such as undesirable cache flushes or memory bus contention. In this paper, we show how runtime system-based scheduling contexts can be used to dynamically enforce locality of parallel tasks on multicore machines. We extend an existing generic sparse direct solver to use our mechanism and introduce a new decomposition method based on proportional mapping that is used to build the scheduling contexts. We propose a runtime-level dynamic context management policy to cope with the very irregular behaviour of the application. A detailed performance analysis shows significant performance improvements of the solver over various multicore hardware.