Cloud-based parallel suffix array construction based on MPI

A. Abdelhadi, A. Kandil, M. Abouelhoda
Published in: 2nd Middle East Conference on Biomedical Engineering, 2014-04-07
DOI: 10.1109/MECBME.2014.6783271 (https://doi.org/10.1109/MECBME.2014.6783271)
Citations: 8

Abstract

Massive amounts of genomic data are now being produced by next-generation sequencing machines. The suffix array is currently the best choice for indexing genomic data because of its efficiency and its large number of applications. In this paper, we address the problem of constructing the suffix array on a computer cluster in the cloud. We present a solution that automates the establishment of a computer cluster in a cloud and automatically constructs the suffix array in a distributed fashion over the cluster nodes. This has the advantage of encapsulating all set-up details and the execution of the algorithm. The distributed nature of the algorithm we use overcomes the problem that arises when the user wishes, for cost reasons, to use low-memory machines in the cloud. Our experiments show that our implementation scales well as the number of processors increases. The cloud cost is affordable, making this a cost-effective solution.
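
For readers unfamiliar with the data structure, the following minimal Python sketch shows what a suffix array is and why its construction can be split into independent pieces. It is an illustration only, not the authors' MPI implementation: the abstract does not specify their algorithm, and the function names `suffix_array` and `bucketed_suffix_array` are hypothetical. The second function assumes a simple partitioning by leading character, so that each bucket could in principle be sorted on a separate low-memory node and the results concatenated.

```python
from collections import defaultdict


def suffix_array(text):
    """Naive construction: sort suffix start positions lexicographically.

    Fine for short strings; real genome-scale tools use far more
    memory- and time-efficient algorithms.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def bucketed_suffix_array(text):
    """Illustrative partitioning (an assumption, not the paper's method).

    Suffixes are grouped by their first character. Each group is an
    independent unit of work that a separate worker could sort; joining
    the sorted groups in character order yields the full suffix array.
    """
    buckets = defaultdict(list)
    for i, ch in enumerate(text):
        buckets[ch].append(i)

    result = []
    for ch in sorted(buckets):
        # In a distributed setting, this sort would run on one worker.
        result.extend(sorted(buckets[ch], key=lambda i: text[i:]))
    return result


if __name__ == "__main__":
    text = "banana$"  # '$' is a sentinel smaller than every letter
    print(suffix_array(text))           # [6, 5, 3, 1, 0, 4, 2]
    print(bucketed_suffix_array(text))  # same result
```

Both functions return the same array; the bucketed version only shows that the work decomposes into parts that need not share memory, which is the property that makes low-memory cluster nodes usable.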