Efficient and Scalable Workflows for Genomic Analyses

Subho Sankar Banerjee, A. Athreya, L. S. Mainzer, C. Jongeneel, Wen-mei W. Hwu, Z. Kalbarczyk, R. Iyer
Published in: Proceedings of the ACM International Workshop on Data-Intensive Distributed Computing, 2016-06-01
DOI: 10.1145/2912152.2912156 (https://doi.org/10.1145/2912152.2912156)
Citations: 12

Abstract

Recent growth in the volume of DNA sequence data, and in the associated computational cost of extracting meaningful information from it, calls for efficient computational systems at scale. In this work, we propose the Illinois Genomics Execution Environment (IGen), a framework for efficient and scalable genome analyses. The design philosophy of IGen is based on algorithmic analysis and extensive measurements of compute- and data-intensive genomic analysis workflows (such as variant discovery and genotyping analysis) executed on high-performance and cloud computing infrastructures. IGen leverages the advantages of existing designs and proposes new software improvements to overcome the inefficiencies we observe in our measurements. Based on these composite improvements, we demonstrate that IGen accelerates alignment from 13.1 hours to 10.8 hours (1.2x) and variant calling from 10.1 hours to 1.25 hours (8x) on a single node, and that its modular design scales efficiently in a parallel computing environment.
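The speedup factors quoted in the abstract can be checked with a little arithmetic. The sketch below is only an illustration: the four runtimes come from the abstract, while the helper name `speedup` and the combined figure are assumptions added here. In particular, the combined number assumes alignment and variant calling run back to back on the same node, which the abstract does not state.

```python
# Single-node runtimes in hours, as reported in the abstract.
ALIGNMENT = (13.1, 10.8)        # (baseline, IGen)
VARIANT_CALLING = (10.1, 1.25)  # (baseline, IGen)


def speedup(before_hours: float, after_hours: float) -> float:
    """Return the speedup factor: baseline runtime divided by improved runtime."""
    return before_hours / after_hours


alignment_speedup = speedup(*ALIGNMENT)
variant_speedup = speedup(*VARIANT_CALLING)

# Assumption (not stated in the abstract): the two stages run sequentially,
# so their runtimes simply add.
total_before = ALIGNMENT[0] + VARIANT_CALLING[0]
total_after = ALIGNMENT[1] + VARIANT_CALLING[1]
combined_speedup = speedup(total_before, total_after)

print(f"alignment: {alignment_speedup:.2f}x")
print(f"variant calling: {variant_speedup:.2f}x")
print(f"combined: {total_before:.1f} h -> {total_after:.2f} h ({combined_speedup:.2f}x)")
```

Under that sequential assumption, the large gain in variant calling is diluted by the smaller gain in alignment, which dominates the remaining runtime.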