A Novel Method for Alignment-free DNA Sequence Similarity Analysis Based on the Characterization of Complex Networks

Jie Zhou, Pianyu Zhong, Tinghui Zhang
Evolutionary Bioinformatics Online, volume 12, 2016. DOI: 10.4137/EBO.S40474 (https://doi.org/10.4137/EBO.S40474)
Citations: 13

Abstract

Determining sequence similarity is a major step in computational phylogenetic studies, and developing novel mathematical descriptors for similarity analysis is a major task for computational biologists. DNA clustering automatically identifies inherent relationships among large-scale DNA sequences, and comparing the DNA sequences of different species helps determine their phylogenetic relationships. Alignment-free approaches have attracted growing interest in sequence analysis applications such as phylogenetic inference and metagenomic classification/clustering, particularly for large-scale sequence datasets. Here, we construct a novel and simple mathematical descriptor based on the characterization of cis sequence complex DNA networks. The approach is based on the code of three cis nucleotides in a gene that can code for an amino acid. For each DNA sequence, we set up a cis sequence complex network and use it to derive a characterization vector, which we apply to the analysis of phylogenetic relationships among the mitochondrial DNA sequences of nine species. The resulting phylogenetic relationships among the nine species were found to agree with the actual situation.
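The abstract does not give the construction of the network or of the characterization vector, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes that nodes are the 64 nucleotide triplets, that a directed edge links each triplet to the next overlapping triplet in the sequence, that the characterization vector is the normalized out-strength of each node, and that two sequences are compared by Euclidean distance. The names `triplet_network`, `characterization_vector`, and `distance` are hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product
import math

NUCLEOTIDES = "ACGT"
# Assumption: the network has one node per nucleotide triplet (64 in total).
TRIPLETS = ["".join(p) for p in product(NUCLEOTIDES, repeat=3)]


def triplet_network(seq):
    """Return weighted directed edges between consecutive overlapping triplets.

    Assumption: overlapping triplets (step 1) are used; the paper may instead
    use a fixed reading frame or a different adjacency rule.
    """
    seq = seq.upper()
    triplets = [seq[i:i + 3] for i in range(len(seq) - 2)]
    edges = Counter()
    for a, b in zip(triplets, triplets[1:]):
        # Skip any pair containing ambiguous bases such as N.
        if set(a + b) <= set(NUCLEOTIDES):
            edges[(a, b)] += 1
    return edges


def characterization_vector(seq):
    """Return a 64-dimensional vector of normalized node out-strengths.

    Assumption: out-strength stands in for whatever network measures
    the paper actually uses.
    """
    edges = triplet_network(seq)
    total = sum(edges.values())
    strength = Counter()
    for (source, _target), weight in edges.items():
        strength[source] += weight
    return [strength[t] / total if total else 0.0 for t in TRIPLETS]


def distance(u, v):
    """Euclidean distance between two characterization vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))
```

Under these assumptions, a pairwise distance matrix over the species' characterization vectors could then be passed to any standard clustering or tree-building method. Whether that reproduces the paper's results depends on the actual network definition, which is not given here.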