Experiences with Cross-Facility Real-Time Light Source Data Analysis Workflows
Anna Giannakou, Johannes P. Blaschke, Deborah Bard, L. Ramakrishnan
2021 IEEE/ACM HPC for Urgent Decision Making (UrgentHPC), published 2021-11-01
DOI: https://doi.org/10.1109/UrgentHPC54802.2021.00011
Citations: 9
Abstract
Scientific data from experimental and observational facilities is growing, and this growth is creating significant new computational patterns and needs. For example, the analysis workflows of scientists running experiments at light sources often require near real-time access to compute resources in order to obtain results used to reconfigure ongoing experiments. These workflows often have requirements that differ from those of the large-scale parallel applications that have traditionally run at HPC centers. In this paper, we present our experiences supporting two light source data analysis workflows that run on HPC resources at the National Energy Research Scientific Computing Center (NERSC). We discuss the characteristics of the workflows, their runtime requirements, and the associated execution challenges when running in HPC environments. We then discuss and summarize best practices that address these execution challenges, along with current and future solutions for leveraging HPC resources for near real-time data analysis.