Overlap Detection for a Genome Assembly Based on Genomic Signal Processing
Robin Jugas, K. Sedlář, Martin Vítek, Helena Skutková
2017 IEEE 30th International Symposium on Computer-Based Medical Systems (CBMS), published 2017-06-01
DOI: 10.1109/CBMS.2017.140 (https://doi.org/10.1109/CBMS.2017.140)
Abstract
Although the genome sequences of the most-studied organisms, such as human and E. coli, are already known, de novo genome sequencing remains popular because the majority of genomes remain unknown. Unfortunately, sequencing machines can read only short fragments of DNA. Therefore, one of the basic steps in reconstructing novel genomes is putting these pieces of DNA, called reads, together into complete genome sequences in a process known as genome assembly. Joining reads, however, requires efficient detection of their overlaps. This is commonly done by comparing the individual characters (A, C, G, T) of the reads using string processing techniques. In this paper, we present an alternative way of detecting overlaps using genomic signal processing. Unlike character strings, numerical phase signals reflect the complementarity of double-stranded DNA, which makes them well suited to accurate, strand-independent overlap detection using covariance.
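The idea can be illustrated with a minimal sketch, which is not the authors' implementation. It assumes a phase mapping often used in genomic signal processing (A = π/4, G = 3π/4, T = -π/4, C = -3π/4), which the abstract does not specify, and it scores candidate overlaps with the Pearson correlation (a normalized covariance). Under this assumed mapping, complementing a base negates its phase, so the opposite strand is obtained by reversing and negating the signal. All function names and the `min_len` parameter are hypothetical.

```python
import math

# Assumed phase mapping (not given in the abstract): complementary bases
# have phases of equal magnitude and opposite sign (A/T and C/G).
PHASE = {"A": math.pi / 4, "G": 3 * math.pi / 4,
         "T": -math.pi / 4, "C": -3 * math.pi / 4}


def phase_signal(read):
    """Convert a read (string over A, C, G, T) into a list of phase values."""
    return [PHASE[base] for base in read.upper()]


def reverse_complement_signal(signal):
    """Signal of the opposite strand: reverse the order and negate each phase."""
    return [-p for p in reversed(signal)]


def pearson(x, y):
    """Normalized covariance of two equal-length signals (0.0 if one is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) * (a - mx) for a in x)
    vy = sum((b - my) * (b - my) for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def best_overlap(read_a, read_b, min_len=8):
    """Find the suffix of read_a that best matches a prefix of read_b.

    Both strands of read_b are tried. Returns (length, strand, score),
    where strand is "+" (read_b as given) or "-" (its reverse complement).
    """
    sig_a = phase_signal(read_a)
    sig_b = phase_signal(read_b)
    candidates = {"+": sig_b, "-": reverse_complement_signal(sig_b)}
    best = (0, None, float("-inf"))
    for strand, sig in candidates.items():
        # Try the longest overlap first; keep only strictly better scores.
        for k in range(min(len(sig_a), len(sig)), min_len - 1, -1):
            score = pearson(sig_a[-k:], sig[:k])
            if score > best[2]:
                best = (k, strand, score)
    return best


if __name__ == "__main__":
    a = "CTTGGATTACAGGC"
    b = "GATTACAGGCTCAA"               # shares the 10-base overlap GATTACAGGC
    print(best_overlap(a, b))
    print(best_overlap(a, "TTGAGCCTGTAATC"))  # reverse complement of b
```

Because correlation is unchanged by shifting or rescaling a signal, a short window containing only a few distinct bases can score 1 without the sequences being identical. A practical detector would therefore need a sensible minimum overlap length and a verification step for reported candidates.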