Latest ISC Workshops Publications

Challenges and Opportunities for RISC-V Architectures towards Genomics-based Workloads
ISC Workshops Pub Date: 2023-06-27 DOI: 10.48550/arXiv.2306.15562
Gonzalo Gómez-Sánchez, A. Call, Xavier Teruel, Lorena Alonso, Ignasi Morán, Miguel Angel Perez, D. Torrents, J. L. Berral
Abstract: The use of large-scale supercomputing architectures is a hard requirement for Big-Data applications in scientific computing. An example is genomics analytics, where millions of data transformations and tests per patient are needed to find relevant clinical indicators. Therefore, to ensure open and broad access to high-performance technologies, governments and academia are pushing toward the introduction of novel computing architectures in large-scale scientific environments. This is the case of RISC-V, an open-source and royalty-free instruction-set architecture. To evaluate such technologies, here we present the Variant-Interaction Analytics use case benchmarking suite and datasets. Through this use case, we search for possible genetic interactions using computational and statistical methods, providing a representative case of heavy ETL (Extract, Transform, Load) data processing. The current implementation runs on x86-based supercomputers (e.g., MareNostrum-IV at the Barcelona Supercomputing Center (BSC)), and future steps propose RISC-V as part of the next MareNostrum generations. Here we describe the Variant Interaction use case, highlighting the characteristics that leverage high-performance computing, and indicating the caveats and challenges for the RISC-V developments and designs to come, based on a first comparison between x86 and RISC-V architectures on real Variant Interaction executions over real hardware implementations.
Citations: 0
Software Development Vehicles to enable extended and early co-design: a RISC-V and HPC case of study
ISC Workshops Pub Date: 2023-06-01 DOI: 10.48550/arXiv.2306.01797
F. Mantovani, Pablo Vizcaino, Fabio Banchelli, M. Garcia-Gasulla, R. Ferrer, Giorgos Ieronymakis, Nikos Dimou, Vassilis D. Papaefstathiou, Jesús Labarta
Abstract: Prototyping HPC systems with low-to-mid technology readiness level (TRL) components is critical for providing feedback to hardware designers, the system software team (e.g., compiler developers), and early adopters from the scientific community. The typical approach to hardware design and HPC system prototyping often limits feedback or only allows it at a late stage. In this paper, we present a set of tools for co-designing HPC systems, called software development vehicles (SDV). We use an innovative RISC-V design as a demonstrator, which includes a scalar CPU and a vector processing unit capable of operating on vectors of up to 16 kbits. We provide an incremental methodology and early tangible evidence of the co-design process that provides feedback to improve both architecture and system software at a very early stage of system development.
Citations: 2
Backporting RISC-V Vector assembly
ISC Workshops Pub Date: 2023-04-20 DOI: 10.48550/arXiv.2304.10324
Joseph K. L. Lee, Maurice Jamieson, Nick Brown
Abstract: Leveraging vectorisation, the ability of a CPU to apply operations to multiple elements of data concurrently, is critical for high-performance workloads. However, at the time of writing, commercially available physical RISC-V hardware that provides the RISC-V Vector extension (RVV) only supports version 0.7.1, which is incompatible with the latest ratified version 1.0. The challenge is that upstream compiler toolchains, such as Clang, only target the ratified v1.0 and do not support the older v0.7.1. Because v1.0 is not compatible with v0.7.1, the only way to produce vectorised code is to use an older, vendor-provided compiler. In this paper we introduce the rvv-rollback tool, which translates assembly code generated by a compiler targeting vector extension v1.0 into v0.7.1 instructions. We use this tool to compare the vectorisation performance of the vendor-provided GNU 8.4 compiler (supporting v0.7.1) against LLVM 15.0 (supporting only v1.0), finding that the LLVM compiler is capable of auto-vectorising more computational kernels and delivers greater performance than GNU in most, but not all, cases. We also tested LLVM vectorisation with vector-length-agnostic and vector-length-specific settings, and observed cases with significant differences in performance.
Citations: 3
Test-driving RISC-V Vector hardware for HPC
ISC Workshops Pub Date: 2023-04-20 DOI: 10.48550/arXiv.2304.10319
Joseph K. L. Lee, Maurice Jamieson, Nick Brown, Ricardo Jesus
Abstract: Whilst the RISC-V Vector extension (RVV) has been ratified, at the time of writing both hardware implementations and open-source software support for vectorisation on RISC-V remain limited. This matters because vectorisation is crucial for good performance in High Performance Computing (HPC) workloads and, as of April 2023, the Allwinner D1 SoC, containing the XuanTie C906 processor, is the only mass-produced and commercially available hardware supporting RVV. This paper surveys the state of RISC-V vectorisation as of 2023, reporting on both the hardware and software ecosystem. Driving our discussion from experiences in setting up the Allwinner D1 as part of the EPCC RISC-V testbed, we report the results of benchmarking the Allwinner D1 with the RAJA Performance Suite, which demonstrated reasonable vectorisation speedup using the vendor-provided compiler, as well as favourable performance compared to the StarFive VisionFive V2 with SiFive's U74 processor.
Citations: 5
Portability and Scalability of OpenMP Offloading on State-of-the-art Accelerators
ISC Workshops Pub Date: 2023-04-09 DOI: 10.48550/arXiv.2304.04276
Yehonatan Fridman, G. Tamir, Gal Oren
Abstract: Over the last decade, most of the increase in computing power has come from advances in accelerated many-core architectures, mainly in the form of GPGPUs. While accelerators achieve phenomenal performance in various computing tasks, using them requires code adaptations and transformations. Thus OpenMP, the most common standard for multi-threading in scientific computing applications, has offered offloading capabilities between the host (CPUs) and accelerators since v4.0, with increasing support in the successive v4.5, v5.0, v5.1, and latest v5.2 versions. Recently, two state-of-the-art GPUs, the Intel Ponte Vecchio Max 1100 and the NVIDIA A100, were released to market, with the oneAPI and NVHPC compilers for offloading, respectively. In this work, we present early performance results for OpenMP offloading to these devices, specifically analysing the portability of advanced directives (using SOLLVE's OMPVV test suite) and the scalability of the hardware on a representative scientific mini-app (the LULESH benchmark). Our results show that coverage of version 4.5 is nearly complete in both the latest NVHPC and oneAPI tools. However, we observed a lack of support for versions 5.0, 5.1, and 5.2, particularly noticeable with NVHPC. From the performance perspective, we found the PVC1100 and A100 to be roughly comparable on the LULESH benchmark: while the A100 is slightly faster thanks to higher memory bandwidth, the PVC1100 scales to the next problem size (400^3) thanks to its larger memory.
Citations: 1
Precise Energy Consumption Measurements of Heterogeneous Artificial Intelligence Workloads
ISC Workshops Pub Date: 2022-12-03 DOI: 10.48550/arXiv.2212.01698
R. Caspart, Sebastian Ziegler, Arvid Weyrauch, Holger Obermaier, Simon Raffeiner, Leonie Schuhmacher, J. Scholtyssek, D. Trofimova, M. Nolden, I. Reinartz, Fabian Isensee, Markus Goetz, C. Debus
Abstract: With the rise of artificial intelligence (AI) in recent years and the subsequent increase in the complexity of the applied models, the growing demand for computational resources is starting to pose a significant challenge. The need for higher compute power is being met with increasingly potent accelerator hardware as well as large and powerful compute clusters. However, the gain in prediction accuracy from large models trained on distributed and accelerated systems ultimately comes at the price of a substantial increase in energy demand, and researchers have started questioning the environmental friendliness of such AI methods at scale. Consequently, awareness of energy efficiency plays an important role for AI model developers and hardware infrastructure operators alike. The energy consumption of AI workloads depends both on the model implementation and on the composition of the hardware used. Therefore, accurate measurements of the power draw of AI workflows on different types of compute nodes are key to algorithmic improvements and to the design of future compute clusters and hardware. Towards this end, we present measurements of the energy consumption of two typical deep learning applications on different types of heterogeneous compute nodes. Our results indicate that: 1. contrary to common approaches, deriving energy consumption directly from runtime is not accurate; instead, the consumption of the compute node needs to be considered with regard to its composition; 2. neglecting accelerator hardware on mixed nodes results in disproportionate energy inefficiency; 3. the energy consumption of model training and inference should be considered separately: while training on GPUs outperforms all other node types in both runtime and energy consumption, inference on CPU nodes can be comparably efficient. One advantage of our approach is that the information on energy consumption is available to all users of the supercomputer, not just those with administrator rights, enabling an easy transfer to other workloads alongside raising user awareness of energy consumption.
Citations: 3
Workflows to driving high-performance interactive supercomputing for urgent decision making
ISC Workshops Pub Date: 2022-06-28 DOI: 10.48550/arXiv.2206.14103
Nick Brown, R. Nash, G. Gibb, E. Belikov, Artur Podobas, W. Chien, S. Markidis, M. Flatken, A. Gerndt
Abstract: Interactive urgent computing is a small but growing user of supercomputing resources. However, there are numerous technical challenges that must be overcome to make supercomputers fully suited to the wide range of urgent workloads that could benefit from the computational power delivered by such instruments. An important question is how to connect the different components of an urgent workload, namely the users, the simulation codes, and the external data sources, in a structured and accessible manner. In this paper we explore the role of workflows both for marshalling and controlling urgent workloads and at the individual HPC machine level, ultimately requiring two workflow systems. Using an urgent space-weather-prediction use case, we explore the benefits these two workflow systems provide, especially when one exploits the flexibility enabled by their interoperation.
Citations: 0
Automatic Tuning of Tensorflow's CPU Backend using Gradient-Free Optimization Algorithms
ISC Workshops Pub Date: 2021-09-13 DOI: 10.1007/978-3-030-90539-2_17
Derssie Mebratu, N. Hasabnis, Pietro Mercati, Gaurit Sharma, S. Najnin
Citations: 0
Negative Perceptions About the Applicability of Source-to-Source Compilers in HPC: A Literature Review
ISC Workshops Pub Date: 2021-07-01 DOI: 10.1007/978-3-030-90539-2_16
Reed Milewicz, P. Pirkelbauer, Prema Soundararajan, H. Ahmed, A. Skjellum
Citations: 5
Lettuce: PyTorch-based Lattice Boltzmann Framework
ISC Workshops Pub Date: 2021-06-24 DOI: 10.1007/978-3-030-90539-2_3
Mario Bedrunka, D. Wilde, Martin L. Kliemank, D. Reith, H. Foysi, Andreas Krämer
Citations: 8