Filtration of string proximity search via transformation

S. Aghili, D. Agrawal, A. E. Abbadi
Third IEEE Symposium on Bioinformatics and Bioengineering, 2003. Proceedings.
Published: 2003-03-10
DOI: 10.1109/BIBE.2003.1188941 (https://doi.org/10.1109/BIBE.2003.1188941)
Citations: 10

Abstract

The problem of proximity search in biological databases is addressed. We study vector transformations and apply DFT (Discrete Fourier Transformation) and DWT (Discrete Wavelet Transformation, Haar) dimensionality reduction techniques to DNA sequence proximity search in order to reduce the search time of range queries. Our empirical results on a number of Prokaryote and Eukaryote DNA contig databases demonstrate up to a 50-fold filtration ratio of the search space and up to 13 times faster filtration. The proposed transformation techniques may easily be integrated as a preprocessing phase on top of existing similarity search heuristics such as BLAST, PatternHunter, FastA, and QUASAR to efficiently prune non-relevant sequences. We study the precision of applying dimensionality reduction techniques for faster and more efficient range query searches, and discuss the imposed trade-offs.
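The filtering idea in the abstract can be illustrated with a small sketch. The abstract does not say how sequences are turned into vectors, so the 2-gram count encoding below is an assumption, as are the names `kgram_vector`, `haar`, and `range_filter` and the choice to keep the first 4 coefficients. What the sketch does rely on is a standard property: an orthonormal Haar transform preserves Euclidean distance, so the distance computed on a prefix of the coefficients never exceeds the distance between the full vectors. A range query can therefore discard any sequence whose reduced distance already exceeds the radius without missing a true match under that vector distance.

```python
import math
from itertools import product

# Assumed encoding: counts of the 16 overlapping 2-grams over {A, C, G, T}.
KGRAMS = ["".join(p) for p in product("ACGT", repeat=2)]
INDEX = {g: i for i, g in enumerate(KGRAMS)}


def kgram_vector(seq):
    """Return the 16-dimensional 2-gram count vector of a DNA string."""
    vec = [0.0] * len(KGRAMS)
    for i in range(len(seq) - 1):
        gram = seq[i:i + 2]
        if gram in INDEX:  # skip 2-grams containing other symbols, e.g. N
            vec[INDEX[gram]] += 1.0
    return vec


def haar(vec):
    """Orthonormal Haar transform; len(vec) must be a power of two.

    Coefficients are ordered coarse to fine: overall average first,
    then detail coefficients of increasing resolution.
    """
    out = list(vec)
    n = len(out)
    root2 = math.sqrt(2.0)
    while n > 1:
        half = n // 2
        avg = [(out[2 * i] + out[2 * i + 1]) / root2 for i in range(half)]
        diff = [(out[2 * i] - out[2 * i + 1]) / root2 for i in range(half)]
        out[:n] = avg + diff
        n = half
    return out


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def range_filter(query, database, radius, keep=4):
    """Keep sequences whose reduced-space distance to the query is <= radius.

    Survivors are only candidates; they still need to be checked by the
    full similarity search (for example an alignment tool).
    """
    q = haar(kgram_vector(query))[:keep]
    return [s for s in database
            if euclidean(q, haar(kgram_vector(s))[:keep]) <= radius]


if __name__ == "__main__":
    db = ["ACGTACGT", "ACGTACGA", "TTTTTTTT"]
    print(range_filter("ACGTACGT", db, radius=1.0))
```

Keeping fewer coefficients makes each comparison cheaper but loosens the lower bound, so more non-matching sequences pass the filter. This is one form of the precision trade-off the abstract mentions. The same lower-bound argument holds for a DFT with orthonormal scaling.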