An Improved Quick Algorithm for Aligning DNA/RNA Sequences
Q. Zou, Maozu Guo, Yang Liu, Taotao Zhang
2007 International Conference on Computational Intelligence and Security Workshops (CISW 2007), 2007-12-15
DOI: 10.1109/CIS.WORKSHOPS.2007.58
Citations: 1
Abstract
Searching biological sequence databases is an important problem in bioinformatics research. The main search software, BLAST, is based on the BYP algorithm. In BYP, if the hit-region length r is set too long, similar occurrences may be missed; if it is set too short, the Aho-Corasick algorithm may report too many r-length regions. Checking whether an approximate alignment of the DNA or RNA sequence exists around each region then takes much time, even though the expected running time of BYP is linear. A new algorithm is proposed that shortens the regions and increases their number, which excludes most of the false r-length regions. Because the proportion of true r-length regions increases, the algorithm can align DNA or RNA sequences quickly. Experimental results show that the proposed algorithm is computationally efficient and achieves good performance.
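
The general filter-then-verify idea behind BYP-style search can be sketched as follows. This is a minimal illustration under stated assumptions, not the algorithm proposed in the paper: the function names (`split_pattern`, `best_edit_distance`, `filter_search`) are hypothetical, the pattern is split into k + 1 pieces so that at least one piece must occur exactly in any match with at most k edits, and Python's `str.find` stands in for the Aho-Corasick multi-pattern search.

```python
def split_pattern(pattern, k):
    """Split pattern into k + 1 contiguous pieces of near-equal length.

    Returns a list of (offset_in_pattern, piece) pairs.
    """
    base, extra = divmod(len(pattern), k + 1)
    pieces, start = [], 0
    for i in range(k + 1):
        length = base + (1 if i < extra else 0)
        pieces.append((start, pattern[start:start + length]))
        start += length
    return pieces


def best_edit_distance(pattern, window):
    """Smallest edit distance between pattern and any substring of window."""
    prev = [0] * (len(window) + 1)  # row 0: a match may start anywhere
    for i, pc in enumerate(pattern, 1):
        cur = [i]
        for j, wc in enumerate(window, 1):
            cur.append(min(prev[j] + 1,                # skip a pattern character
                           cur[j - 1] + 1,             # skip a window character
                           prev[j - 1] + (pc != wc)))  # match or substitute
        prev = cur
    return min(prev)  # a match may end anywhere


def filter_search(text, pattern, k):
    """Return estimated start positions where pattern aligns with <= k edits."""
    m = len(pattern)
    starts = set()
    for offset, piece in split_pattern(pattern, k):
        if not piece:
            continue  # pattern shorter than k + 1: piece carries no information
        pos = text.find(piece)  # assumption: stand-in for Aho-Corasick
        while pos != -1:
            # Verify only the neighbourhood implied by this exact hit.
            lo = max(0, pos - offset - k)
            hi = min(len(text), pos - offset + m + k)
            if best_edit_distance(pattern, text[lo:hi]) <= k:
                starts.add(max(0, pos - offset))
            pos = text.find(piece, pos + 1)
    return sorted(starts)
```

Shorter pieces produce more exact hits, and every hit triggers the quadratic verification step, which is the cost the abstract refers to. For example, `filter_search("ACGTTTGACCAGTACGGA", "GACCTGTA", 1)` verifies a single window around the exact hit of the piece `GACC`.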