Overlap Detection for a Genome Assembly Based on Genomic Signal Processing

Robin Jugas, K. Sedlář, Martin Vítek, Helena Skutková
Published in: 2017 IEEE 30th International Symposium on Computer-Based Medical Systems (CBMS)
Publication date: 2017-06-01
DOI: 10.1109/CBMS.2017.140 (https://doi.org/10.1109/CBMS.2017.140)

Abstract

Although the genome sequences of the most studied organisms, such as human and E. coli, are already known, de novo genome sequencing remains popular because the majority of genomes are still unknown. Unfortunately, sequencing machines can read only short fragments of DNA. Therefore, one of the basic steps in reconstructing novel genomes is putting these pieces of DNA, called reads, together into complete genome sequences, a process known as genome assembly. Joining reads, however, requires efficient detection of their overlaps. This is commonly done by comparing the individual characters (A, C, G, T) of the reads using string processing techniques. In this paper, we present an alternative way of detecting overlaps using genomic signal processing. Unlike string comparison, numerical phase signals reflect the complementarity of double-stranded DNA, which makes them well suited to effective, strand-independent overlap detection with high accuracy using covariance.
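The abstract does not spell out the phase mapping or the scoring rule, so the following is only a minimal sketch of the idea under stated assumptions, not the authors' implementation. It assumes a phase mapping often used in genomic signal processing (A = π/4, G = 3π/4, C = −3π/4, T = −π/4), under which complementary bases have phases of opposite sign, and it scores a candidate suffix/prefix overlap with normalized covariance (Pearson correlation). The function names, `min_len`, and `threshold` are hypothetical.

```python
import math

# Assumed mapping: complementary bases (A/T, C/G) get phases of opposite sign.
PHASE = {"A": math.pi / 4, "T": -math.pi / 4,
         "G": 3 * math.pi / 4, "C": -3 * math.pi / 4}


def phase_signal(read):
    """Convert a read (string over A, C, G, T) into a list of phases."""
    return [PHASE[base] for base in read.upper()]


def covariance(x, y):
    """Population covariance of two equally long signals."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n


def correlation(x, y):
    """Covariance normalized to [-1, 1]; 0.0 if either signal is constant."""
    denom = math.sqrt(covariance(x, x) * covariance(y, y))
    return covariance(x, y) / denom if denom > 0 else 0.0


def find_overlap(left, right, min_len=8, threshold=0.95):
    """Longest overlap between a suffix of `left` and `right`.

    Returns (length, correlation, strand). Strand "+" means a prefix of
    `right` matches as given; "-" means `right` comes from the opposite
    strand, so its reversed signal is the negated signal of the matching
    sequence and the correlation is close to -1.
    """
    a, b = phase_signal(left), phase_signal(right)
    b_reversed = b[::-1]
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        suffix = a[-k:]
        r = correlation(suffix, b[:k])
        if r >= threshold:
            return k, r, "+"
        r = correlation(suffix, b_reversed[:k])
        if r <= -threshold:
            return k, r, "-"
    return 0, 0.0, None


if __name__ == "__main__":
    left = "GATTACAGGCTTACGATCCA"
    print(find_overlap(left, "TTACGATCCAGTG"))   # read from the same strand
    print(find_overlap(left, "CACTGGATCGTAA"))   # its reverse complement
```

In this sketch the opposite-strand case needs no separate reverse-complement string: reversing the phase signal and looking for a strongly negative correlation is enough, which is one way to read the strand independence claimed in the abstract. A fixed correlation threshold is a simplification; real reads contain sequencing errors and would need a more tolerant criterion.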