Back translated peptide K-mer search and local alignment in large DNA sequence databases using BoND-SD-tree indexing

A. T. Islam, S. Pramanik, Xinge Ji, J. Cole, Qiang Zhu
Published in: 2015 IEEE 15th International Conference on Bioinformatics and Bioengineering (BIBE)
Publication date: 2015-11-02
DOI: 10.1109/BIBE.2015.7367638 (https://doi.org/10.1109/BIBE.2015.7367638)
Citations: 6

Abstract

Genome sequence databases have traditionally relied on main-memory indexes, such as the suffix tree, for fast sequence searches. With next-generation sequencing technologies, the volume of sequence data being generated is huge, and main-memory indexing is limited by the amount of memory available. K-mer based techniques are increasingly used for genome sequence database applications such as local alignment, and k-mers also provide an excellent basis for efficient disk-based indexing. In this paper, we propose a k-mer based database search and local alignment tool that uses box queries on BoND-SD-tree indexing. The BoND-tree is quite efficient for indexing and searching in a Non-ordered Discrete Data Space (NDDS). We have conducted experiments on searching DNA sequence databases with back-translated protein query sequences and compared the results with existing methods. We have also implemented local alignment of back-translated protein query sequences against large DNA sequence databases using this index-based k-mer search, and compared the performance of this local alignment approach with that of NCBI's TBLASTN. The results are quite promising and justify the significance of the proposed approach.
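The abstract does not spell out how a back-translated peptide k-mer becomes a box query, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the standard genetic code, treats a box as one set of allowed nucleotides per DNA position, and uses a linear scan where the paper uses a BoND-SD-tree index. The names `peptide_to_box`, `in_box`, `translate`, and `search` are hypothetical. Because a per-position box can over-approximate the true codon set (the box for leucine also admits TTT, which encodes phenylalanine), the sketch verifies each candidate by translating it.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: aa for (a, b, c), aa in zip(product(BASES, repeat=3), AMINO)
}


def peptide_to_box(peptide):
    """Back-translate a peptide into a box: one set of allowed bases per DNA position.

    Assumption: the box is the per-position projection of each residue's codons,
    so it may admit DNA strings that do not encode the peptide.
    """
    box = []
    for aa in peptide:
        codons = [codon for codon, residue in CODON_TABLE.items() if residue == aa]
        for pos in range(3):
            box.append(frozenset(codon[pos] for codon in codons))
    return box


def in_box(kmer, box):
    """True if every base of the DNA k-mer lies in the allowed set for its position."""
    return len(kmer) == len(box) and all(
        base in allowed for base, allowed in zip(kmer, box)
    )


def translate(dna):
    """Translate DNA in frame 0, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def search(dna, peptide):
    """Return start offsets of DNA windows that encode the peptide.

    A linear scan stands in for the disk-based index; the box test is a cheap
    filter and the translation check removes its false positives.
    """
    box = peptide_to_box(peptide)
    k = len(box)
    hits = []
    for i in range(len(dna) - k + 1):
        window = dna[i:i + k]
        if in_box(window, box) and translate(window) == peptide:
            hits.append(i)
    return hits
```

For example, `search("CCATGAAGTT", "MK")` returns `[2]`, since the window `ATGAAG` at offset 2 encodes Met-Lys. An index that answers box queries directly would replace the loop in `search`, which is where the disk-based structure described in the abstract would come in.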