Frontiers in High Performance Computing: Latest Articles

Supercharging distributed computing environments for high-performance data engineering
Frontiers in High Performance Computing Pub Date : 2024-07-12 DOI: 10.3389/fhpcp.2024.1384619
Niranda Perera, A. Sarker, Kaiying Shan, Alex Fetea, Supun Kamburugamuve, Thejaka Amila Kanewala, Chathura Widanage, Mills Staylor, Tianle Zhong, V. Abeykoon, Gregor von Laszewski, Geoffrey Fox
Abstract: The data engineering and data science community has embraced the idea of using Python and R dataframes for regular applications. Driven by the big data revolution and artificial intelligence, these frameworks are now ever more important in order to process terabytes of data. They can easily exceed the capabilities of a single machine but also demand significant developer time and effort due to their convenience and ability to manipulate data with high-level abstractions that can be optimized. Therefore, it is essential to design scalable dataframe solutions. There have been multiple efforts to tackle this problem in the most efficient fashion, the most notable being the dataframe systems developed using distributed computing environments such as Dask and Ray. Even though Dask and Ray's distributed computing features look very promising, we perceive that Dask Dataframes and Ray Datasets still have room for optimization. In this paper, we present CylonFlow, an alternative distributed dataframe execution methodology that enables state-of-the-art performance and scalability on the same Dask and Ray infrastructure (supercharging them!). To achieve this, we integrate a high-performance dataframe system, Cylon, which was originally based on an entirely different execution paradigm, into Dask and Ray. Our experiments show that on a pipeline of dataframe operators, CylonFlow achieves 30× more distributed performance than Dask Dataframes. Interestingly, it also enables superior sequential performance due to leveraging the native C++ execution of Cylon. We believe the performance of Cylon in conjunction with CylonFlow extends beyond the data engineering domain and can be used to consolidate high-performance computing and distributed computing ecosystems.
Citations: 0
A multiphysics coupling framework for exascale simulation of fracture evolution in subsurface energy applications
Frontiers in High Performance Computing Pub Date : 2024-07-01 DOI: 10.3389/fhpcp.2024.1416727
David Trebotich, R. Settgast, Terry Ligocki, William Tobin, Gregory H. Miller, Sergi Molins, C. Steefel
Abstract: Predicting the evolution of fractured media is challenging due to coupled thermal, hydrological, chemical and mechanical processes that occur over a broad range of spatial scales, from the microscopic pore scale to field scale. We present a software framework and scientific workflow that couples the pore-scale flow and reactive transport simulator Chombo-Crunch with the field-scale geomechanics solver in GEOS to simulate fracture evolution in subsurface fluid-rock systems. This new multiphysics coupling capability comprises several novel features. An HDF5 data schema for coupling fracture positions between the two codes is employed; it leverages the coarse resolution of the GEOS mechanics solver, which limits the size of the data coupled, and is thus not taxed by data resulting from the high-resolution pore-scale Chombo-Crunch solver. The coupling framework requires tracking both the before and after coarse nodal positions in GEOS as well as the resolved embedded boundary in Chombo-Crunch. We accomplished this by developing an approach to geometry generation that tracks the fracture interface between the two different methodologies. The GEOS quadrilateral mesh is converted to triangles, which are organized into bins and an accessible tree structure; the nodes are then mapped to the Chombo representation using a continuous signed distance function that determines locations inside, on and outside of the fracture boundary. The GEOS positions are retained in memory on the Chombo-Crunch side of the coupling. The time-stepping cadence for coupled multiphysics processes of flow, transport, reactions and mechanics is stable and demonstrates temporal reach to experimental time scales. The approach is validated by demonstration of 9 days of simulated time of a core flood experiment with fracture aperture evolution due to invasion of carbonated brine in wellbore cement and sandstone. We also demonstrate usage of exascale computing resources by simulating a high-resolution version of the validation problem on OLCF Frontier.
Citations: 0
SmartORC: smart orchestration of resources in the compute continuum
Frontiers in High Performance Computing Pub Date : 2023-10-25 DOI: 10.3389/fhpcp.2023.1164915
Emanuele Carlini, Massimo Coppola, Patrizio Dazzi, Luca Ferrucci, Hanna Kavalionak, Ioannis Korontanis, Matteo Mordacchini, Konstantinos Tserpes
Abstract: The promise of the compute continuum is to present applications with a flexible and transparent view of the resources in the Internet of Things–Edge–Cloud ecosystem. However, such a promise requires tackling complex challenges to maximize the benefits of both the cloud and the edge. Challenges include managing a highly distributed platform, matching services and resources, harnessing resource heterogeneity, and adapting the deployment of services to the changes in resources and applications. In this study, we present SmartORC, a comprehensive set of components designed to provide a complete framework for managing resources and applications in the compute continuum. Along with the description of all the SmartORC subcomponents, we also provide the results of an evaluation aimed at showcasing the framework's capability.
Citations: 0
Opportunities for enhancing MLCommons efforts while leveraging insights from educational MLCommons earthquake benchmarks efforts
Frontiers in High Performance Computing Pub Date : 2023-10-23 DOI: 10.3389/fhpcp.2023.1233877
Gregor von Laszewski, J. P. Fleischer, Robert Knuuti, Geoffrey C. Fox, Jake Kolessar, Thomas S. Butler, Judy Fox
Abstract: MLCommons is an effort to develop and improve the artificial intelligence (AI) ecosystem through benchmarks, public data sets, and research. It consists of members from start-ups, leading companies, academics, and non-profits from around the world. The goal is to make machine learning better for everyone. In order to increase participation by others, educational institutions provide valuable opportunities for engagement. In this article, we identify numerous insights obtained from different viewpoints as part of efforts to utilize high-performance computing (HPC) big data systems in existing education while developing and conducting science benchmarks for earthquake prediction. As this activity was conducted across multiple educational efforts, we project if and how it is possible to make such efforts available on a wider scale. This includes the integration of sophisticated benchmarks into courses and research activities at universities, exposing students and researchers to topics that are otherwise typically not sufficiently covered in current course curricula, as we witnessed from our practical experience across multiple organizations. As such, we have outlined the many lessons we learned throughout these efforts, culminating in the need for benchmark carpentry for scientists using advanced computational resources. The article also presents the analysis of an earthquake prediction code benchmark while focusing on the accuracy of the results and not only on the runtime; notably, this benchmark was created as a result of our lessons learned. Energy traces were produced throughout these benchmarks, which are vital to analyzing the power expenditure within HPC environments. Additionally, one of the insights is that in the short time of the project, with limited student availability, the activity was only possible by utilizing a benchmark runtime pipeline while developing and using software to generate jobs from the permutation of hyperparameters automatically. It integrates a templated job management framework for executing tasks and experiments based on hyperparameters while leveraging hybrid compute resources available at different institutions. The software is part of a collection called cloudmesh with its newly developed components, cloudmesh-ee (experiment executor) and cloudmesh-cc (compute coordinator).
Citations: 2