Shin'ichiro Takizawa, M. Shimizu, H. Nakada, Hiroya Matsuba, Ryousei Takano
{"title":"CloudQ: A Secure AI / HPC Cloud Bursting System","authors":"Shin'ichiro Takizawa, M. Shimizu, H. Nakada, Hiroya Matsuba, Ryousei Takano","doi":"10.1109/HUST56722.2022.00012","DOIUrl":"https://doi.org/10.1109/HUST56722.2022.00012","url":null,"abstract":"As a method to optimize the investment for computational resources, cloud bursting is collecting a lot of attention, where the organizations utilize the cloud computing environment in on-demand fashion, while preserving the minimum amount of on-premise resources for sensitive data processing. For the practical cloud bursting, we need to achieve 1) secure job / data sharing, 2) uniform job execution environment for on-premise and cloud, and 3) on-demand automatic deployment of the execution environment on the cloud. To enable these items, we propose a meta-scheduling system called CloudQ. CloudQ 1) uses cloud object storage for data sharing, 2) utilizes container images to provide uniform job execution environment, and 3) automatically deploys an execution environment on the cloud.","PeriodicalId":308756,"journal":{"name":"2022 IEEE/ACM International Workshop on HPC User Support Tools (HUST)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121366187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
R. Tracey, Mobayode O. Akinsolu, V. Elisseev, Sultan Shoaib
{"title":"pyp2pcluster: A cluster discovery tool","authors":"R. Tracey, Mobayode O. Akinsolu, V. Elisseev, Sultan Shoaib","doi":"10.1109/HUST56722.2022.00007","DOIUrl":"https://doi.org/10.1109/HUST56722.2022.00007","url":null,"abstract":"It is becoming increasingly common for laboratories and universities to share computing resources. Also as cloud usage and applications continue to expand, a hybrid cloud working model is fast becoming a common standard practice. In line with these present-day trends, we present in this paper an open-source Python library that provides information on high performance computing (HPC) clusters and systems that are available to a user via a peer to peer (P2P) infrastructure. These metrics include the size of system and availability of nodes, along with the speed of connection between clusters. We will present the benefits of using a P2P model compared to traditional client server models and look at the ease in which this can be implemented. We will also look at the benefits and uses of gathering this data in one location in order to assist with the managing of complex workloads in heterogeneous environments.","PeriodicalId":308756,"journal":{"name":"2022 IEEE/ACM International Workshop on HPC User Support Tools (HUST)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132387924","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yucheng Zhang, Lev Gorenstein, Payas Bhutra, Ryan T. DeRue
{"title":"Containerized Bioinformatics Ecosystem for HPC","authors":"Yucheng Zhang, Lev Gorenstein, Payas Bhutra, Ryan T. DeRue","doi":"10.1109/HUST56722.2022.00006","DOIUrl":"https://doi.org/10.1109/HUST56722.2022.00006","url":null,"abstract":"Container technologies such as Docker and SingularityCE wrap the application together with everything it needs to run into an isolated environment. This enables containerized applications to always run the same regardless of the environment in which they are running, which positions container technology as a critical tool for data reproducibility in science. In high-performance computing (HPC) environments, SingularityCE has been widely used, and the primary reason for its popularity is that it can significantly reduce system administrators' work of deploying applications. One such domain where we see potential for this technology is in the deployment of bioinformatics applications. Bioinformatics is an interdisciplinary scientific field combining biology, chemistry, computer science, mathematics, statistics, and other areas of science. Traditionally, HPC system administrators may need thousands of hours to compile, install, and deploy a broad stack of bioinformatics applications for users. HPC-friendly container technologies have the potential to transform traditional methods of installing and managing applications. This paper introduces how our HPC center used SingularityCE to provide over 600 containerized bioinformatics applications that were tested by staff with expertise in bioinformatics, on 6 campus production systems as well as ACCESS Anvil. This paper will also explore how, leveraging Lmod, containerization was made transparent to users through environment modules for these container images. Finally, it will discuss how we deployed applications with a graphical user interface (GUI) to Open OnDemand as interactive applications, how we modified Python-based container images to support Jupyter notebooks, and how we generated detailed usage documentation for each application on the ReadTheDocs platform. The sum of these contributions provides a robust and reproducible computing ecosystem for life science researchers. The general approach outlined in this paper is easily adaptable to utilize any underlying container technology for any collection of applications.","PeriodicalId":308756,"journal":{"name":"2022 IEEE/ACM International Workshop on HPC User Support Tools (HUST)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129543062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"NERSC Job Script Generator","authors":"Robin Shao, T. Kurth, Zhengji Zhao","doi":"10.1109/HUST56722.2022.00010","DOIUrl":"https://doi.org/10.1109/HUST56722.2022.00010","url":null,"abstract":"NERSC is the primary scientific computing facility for DOE's Office of Science. NERSC supports diverse production workloads across a wide range of scientific disciplines, which requires a rather complicated queue structure with various resource limits and priorities. It has been challenging for users to generate proper job scripts to optimally use the systems. We developed a Slurm job script generator, a web application to help users not only generate job scripts but also learn how the batch system works. The job script generator was first deployed in 2016 to help generate an optimal process/threads affinity for the hybrid MPI + OpenMP applications for NERSC's Cori system, and was recently extended to support more systems and use cases. In this talk, we will present the features supported in our job script generator, and describe the code design and implementation, which is easily adaptable to other centers who deploy Slurm.","PeriodicalId":308756,"journal":{"name":"2022 IEEE/ACM International Workshop on HPC User Support Tools (HUST)","volume":"223 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114402982","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
G. Bauer, Albert Bode, Brett M. Bode, William T. C. Kramer, C. Mendes, Aaron Saxton
{"title":"Analysis of User-Support Tickets in the Lifetime of the Blue Waters System","authors":"G. Bauer, Albert Bode, Brett M. Bode, William T. C. Kramer, C. Mendes, Aaron Saxton","doi":"10.1109/HUST56722.2022.00008","DOIUrl":"https://doi.org/10.1109/HUST56722.2022.00008","url":null,"abstract":"We present an analysis of the collection of user support tickets created during nearly nine years of operation of the Blue Waters supercomputer. The analysis is based on information obtained from the Jira ticketing system and its corresponding queues. The paper contains a set of statistics showing, in quantitative form, the distribution of tickets across system areas. It also shows the computed metrics related to management of tickets by our staff. Additionally, we present an analysis, based on Machine Learning and Sentiment Analysis techniques, conducted over the text entered in tickets, targeting detecting trends on users' views and perspectives about the Blue Waters system. This kind of study, which is uncommon in the literature, could provide guidance for operators of future large systems about the expected volume of user support demanded by each system area, and about how to allocate support staff such that users receive the best possible assistance.","PeriodicalId":308756,"journal":{"name":"2022 IEEE/ACM International Workshop on HPC User Support Tools (HUST)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116734240","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"HUST22 Workshop Organization","authors":"","doi":"10.1109/hust56722.2022.00005","DOIUrl":"https://doi.org/10.1109/hust56722.2022.00005","url":null,"abstract":"","PeriodicalId":308756,"journal":{"name":"2022 IEEE/ACM International Workshop on HPC User Support Tools (HUST)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131272375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zifan Nan, Mithil Dave, Xipeng Shen, C. Liao, T. Vanderbruggen, Pei-Hung Lin, M. Emani
{"title":"Interactive NLU-Powered Ontology-Based Workflow Synthesis for FAIR Support of HPC","authors":"Zifan Nan, Mithil Dave, Xipeng Shen, C. Liao, T. Vanderbruggen, Pei-Hung Lin, M. Emani","doi":"10.1109/HUST56722.2022.00009","DOIUrl":"https://doi.org/10.1109/HUST56722.2022.00009","url":null,"abstract":"Workflow synthesis is important for automatically creating the data processing workflow in a FAIR data management system for HPC. Previous methods are table-based, rigid and not scalable. This paper addresses these limitations by developing a new approach to workflow synthesis, interactive NLU-powered ontology-based workflow synthesis (INPOWS). INPOWS allows the use of Natural Language for queries, maximizes the robustness in handling concepts and language ambiguities through an interactive ontology-based design, and achieves superior extensibility by adopting a synthesis algorithm powered by Natural Language Understanding. In our experiments, INPOWS shows the efficacy in enabling flexible, robust, and extensible workflow synthesis.","PeriodicalId":308756,"journal":{"name":"2022 IEEE/ACM International Workshop on HPC User Support Tools (HUST)","volume":"97 5-6","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114034201","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Message from the HUST 22 Workshop Chairs","authors":"","doi":"10.1109/hust56722.2022.00004","DOIUrl":"https://doi.org/10.1109/hust56722.2022.00004","url":null,"abstract":"","PeriodicalId":308756,"journal":{"name":"2022 IEEE/ACM International Workshop on HPC User Support Tools (HUST)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122749475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PMT: Power Measurement Toolkit","authors":"Stefano Corda, B. Veenboer, Emma Tolley","doi":"10.1109/HUST56722.2022.00011","DOIUrl":"https://doi.org/10.1109/HUST56722.2022.00011","url":null,"abstract":"Efficient use of energy is essential for today's supercomputing systems, as energy cost is generally a major component of their operational cost. Research into “green computing” is needed to reduce the environmental impact of running these systems. As such, several scientific communities are evaluating the trade-off between time-to-solution and energy-to-solution. While the runtime of an application is typically easy to measure, power consumption is not. Therefore, we present the Power Measurement Toolkit (PMT), a high-level software library capable of collecting power consumption measurements on various hardware. The library provides a standard interface to easily measure the energy use of devices such as CPUs and GPUs in critical application sections.","PeriodicalId":308756,"journal":{"name":"2022 IEEE/ACM International Workshop on HPC User Support Tools (HUST)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124873076","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}