Parallel Accelerating Ultra-Long Read Alignment by Vertical Partitioning Data

Deng Pan, Cheng Zhong, Danyang Chen, Jinxiong Zhang, Feng Yang
Published in: 2022 IEEE 13th International Symposium on Parallel Architectures, Algorithms and Programming (PAAP)
Publication date: 2022-11-25
DOI: 10.1109/PAAP56126.2022.10010526 (https://doi.org/10.1109/PAAP56126.2022.10010526)
Citation count: 0

Abstract

Aligning sequencing reads to a genome is a basic task in biological big-data analysis. Third-generation sequencing reads keep getting longer, and datasets keep getting larger. To address the ultra-long read alignment problem, which places high demands on computing and memory capacity, this paper proposes a strategy for vertically partitioning ultra-long reads on a hybrid CPU/GPU cluster. A heap data structure filters the local alignment results on every computing node of the cluster by alignment score, which reduces the amount of data transmitted. Early termination and parallel merging-splicing accelerate the splicing of local alignment results. The local alignment results from all computing nodes are then collected and extended to obtain the final alignments. Experiments on simulated and real ultra-long read datasets show that the proposed parallel alignment algorithm achieves high overall alignment accuracy, sensitivity, and base-level sensitivity, and speeds up the alignment of ultra-long reads to the reference genome.
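
The abstract does not give implementation details, so the following is only a minimal sketch of two of the ideas it names: cutting a read into segments and using a heap to keep the best-scoring local results before they are sent elsewhere. It assumes that "vertical partitioning" means splitting each read into consecutive segments; the names `LocalHit`, `partition_read`, and `top_k_hits` are hypothetical and do not come from the paper, and the scores are placeholders rather than real alignment scores.

```python
import heapq
from typing import Iterable, List, NamedTuple, Tuple


class LocalHit(NamedTuple):
    """Hypothetical record for one local alignment result."""
    score: int        # alignment score (placeholder values in this sketch)
    read_offset: int  # where the segment starts inside the read
    ref_pos: int      # where the segment aligned on the reference


def partition_read(read: str, num_parts: int) -> List[Tuple[int, str]]:
    """Cut a read into at most num_parts consecutive (offset, segment) pieces.

    Assumption: this is what "vertical partitioning" refers to.
    """
    if num_parts <= 0:
        raise ValueError("num_parts must be positive")
    size = max(1, -(-len(read) // num_parts))  # ceiling division, at least 1
    return [(i, read[i:i + size]) for i in range(0, len(read), size)]


def top_k_hits(hits: Iterable[LocalHit], k: int) -> List[LocalHit]:
    """Keep the k highest-scoring hits using a bounded min-heap."""
    heap: List[LocalHit] = []
    for hit in hits:
        if len(heap) < k:
            heapq.heappush(heap, hit)
        elif heap and hit > heap[0]:
            # The new hit beats the weakest kept hit, so swap it in.
            heapq.heapreplace(heap, hit)
    return sorted(heap, reverse=True)  # best score first


segments = partition_read("ACGTACGTAC", 3)
hits = [LocalHit(5, 0, 100), LocalHit(9, 4, 104),
        LocalHit(1, 8, 300), LocalHit(7, 8, 108)]
best = top_k_hits(hits, 2)
```

With a bounded heap, each node holds at most `k` results in memory and forwards only those, which is one plausible way score-based filtering could reduce the transmission size mentioned in the abstract.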