Task-parallel in situ temporal compression of large-scale computational fluid dynamics data
Heather Pacella, Alec M. Dunton, A. Doostan, G. Iaccarino
International Journal of High Performance Computing Applications, 36(1): 388-418. DOI: 10.1177/10943420221085000
Citations: 7
Abstract
Present-day computational fluid dynamics (CFD) simulations generate considerable amounts of data, sometimes at rates on the order of TB/s. Often, a significant fraction of this data is discarded because current storage systems cannot keep pace. To address this, data compression algorithms can be applied to data arrays containing flow quantities of interest (QoIs) to reduce the overall required storage. The matrix column interpolative decomposition (ID) can serve as a form of lossy compression for data matrices, factoring the original data matrix into a product of two smaller factor matrices. One of these matrices consists of a subset of the columns of the original data matrix, while the other is a coefficient matrix that approximates the columns of the original data matrix as linear combinations of the selected columns. Motivating this work is the observation that the structure of ID algorithms makes them well suited to the asynchronous nature of task-based parallelism; they can operate independently on subdomains of the system of interest and, as a result, provide varying levels of compression. Using the task-based Legion programming model, a single-pass ID algorithm (SPID) for CFD applications is implemented. Performance, scalability, and accuracy of the compression algorithm are presented for a benchmark analytical Taylor-Green vortex problem, as well as for large-scale implementations of both low and high Reynolds number (Re) compressible Taylor-Green vortices using a high-order Navier-Stokes solver. In the case of the analytical solution, the resulting compressed solution was rank-one, with error on the order of machine precision. For the low-Re vortex, compression factors between 1,000 and 10,000 were achieved for errors in the range 10⁻²–10⁻³. Similar error values were seen for the high-Re vortex, this time with compression factors between 100 and 1,000. Moreover, strong and weak scaling results demonstrate that introducing SPID into solvers leads to negligible increases in runtime.
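To make the factorization described above concrete, the sketch below builds a rank-k column ID of a snapshot matrix using NumPy/SciPy with a conventional pivoted-QR construction. This is not the paper's single-pass, task-parallel (Legion) SPID algorithm; it is a minimal serial stand-in, and the snapshot matrix, rank, and sizes are illustrative assumptions.

```python
import numpy as np
from scipy.linalg import qr


def column_id(A, k):
    """Rank-k column interpolative decomposition A ~= C @ P via pivoted QR.

    C holds k columns selected from A; P expresses every column of A as a
    linear combination of those selected columns.
    """
    _, R, piv = qr(A, mode="economic", pivoting=True)
    C = A[:, piv[:k]]                      # selected columns of the original data
    # Coefficients for the remaining columns: solve R11 @ T = R12
    T = np.linalg.solve(R[:k, :k], R[:k, k:])
    P = np.empty((k, A.shape[1]))
    P[:, piv[:k]] = np.eye(k)              # selected columns reproduce themselves exactly
    P[:, piv[k:]] = T                      # remaining columns are interpolated
    return C, P


# Hypothetical snapshot matrix: each column is one time step of a flow QoI
# on one subdomain (sizes and rank chosen only for illustration).
rng = np.random.default_rng(0)
n_dofs, n_steps, true_rank = 4096, 200, 10
A = rng.standard_normal((n_dofs, true_rank)) @ rng.standard_normal((true_rank, n_steps))

C, P = column_id(A, k=10)
rel_err = np.linalg.norm(A - C @ P) / np.linalg.norm(A)
compression = A.size / (C.size + P.size)
print(f"relative error = {rel_err:.2e}, compression factor = {compression:.1f}")
```

In the paper's setting, each Legion task would apply such a decomposition independently to the snapshots of its own subdomain, and SPID does so in a single pass over the data as it is produced; the in-memory, two-pass construction above only illustrates the A ≈ CP structure of the compressed output.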