GYAN: Accelerating Bioinformatics Tools in Galaxy with GPU-Aware Computation Mapping

Gulsum Gudukbay, J. Gunasekaran, Yilin Feng, M. Kandemir, A. Nekrutenko, C. Das, P. Medvedev, B. Grüning, Nate Coraor, Nathan P Roach, E. Afgan
Published in: 2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
Publication date: 2021-06-01
DOI: 10.1109/IPDPSW52791.2021.00037 (https://doi.org/10.1109/IPDPSW52791.2021.00037)
Citations: 2

Abstract

Galaxy is an open-source, web-based framework widely used for computational analyses in diverse application domains, such as genome assembly, computational chemistry, ecology, and epigenetics. The current Galaxy software framework runs on several high-performance computing platforms, including on-premises clusters, public data centers, and national lab supercomputers. These infrastructures also support state-of-the-art accelerators such as Graphics Processing Units (GPUs). With accelerator support, the tools executing in Galaxy can achieve large reductions in computation time, giving researchers a more robust computational analysis environment. Although many tools have GPU capabilities, the current Galaxy framework does not support GPUs and thus prevents tools from taking advantage of the performance benefits GPUs offer. We present and experimentally evaluate GYAN, a GPU-aware computation mapping and orchestration functionality implemented in Galaxy that allows Galaxy tools to be executed on a GPU-enabled cluster. GYAN identifies GPU-supported tools and schedules them on single or multiple GPU nodes based on availability in the cluster. GYAN supports both native and containerized tool execution. We performed extensive evaluations of the implementation using popular bio-engineering tools to demonstrate the benefits of using GPU technologies. For example, the Racon consensus tool executes ~2× faster than the baseline CPU-only jobs, while the Bonito base calling tool shows a ~50× speedup.
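The mapping idea in the abstract (identify GPU-capable tools, then place them on GPUs according to availability) can be sketched roughly as follows. This is a minimal illustration only and is not GYAN's code: the `Tool` class, the `"gpu"` requirement label, and the `map_job` function are hypothetical names introduced here, and the list of free GPU ids is assumed to be supplied by the caller. The only real-world convention used is the `CUDA_VISIBLE_DEVICES` environment variable, which CUDA applications read to decide which devices they may use.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Tool:
    """Hypothetical tool description; not Galaxy's actual tool model."""
    name: str
    requirements: List[str]  # assumption: the label "gpu" marks a GPU-capable tool


def map_job(tool: Tool, free_gpu_ids: List[int], gpus_wanted: int = 1) -> Dict[str, str]:
    """Choose a destination for one job and the environment it should run with."""
    if "gpu" in tool.requirements and free_gpu_ids:
        # Take up to `gpus_wanted` of the currently free devices.
        chosen = free_gpu_ids[:gpus_wanted]
        return {
            "destination": "gpu",
            "CUDA_VISIBLE_DEVICES": ",".join(str(i) for i in chosen),
        }
    # Tool is not GPU-capable, or no GPU is free: fall back to CPU
    # and hide all devices from the job.
    return {"destination": "cpu", "CUDA_VISIBLE_DEVICES": ""}
```

For instance, under these assumptions `map_job(Tool("racon", ["gpu"]), [0, 1], gpus_wanted=2)` would select both free devices, while the same call with an empty list of free GPUs would fall back to the CPU destination.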