Propagation research of SARS-CoV-2 based on evolutionary tree and spectral clustering

Jingyi Yan, Yong Cao, Naien Wu, Fei Xiong, Yongke Sun, Cong Zhong
Published in: 2022 Asia Conference on Algorithms, Computing and Machine Learning (CACML), 2022-03-01
DOI: 10.1109/CACML55074.2022.00106 (https://doi.org/10.1109/CACML55074.2022.00106)

Abstract

RNA viruses are characterized by a high mutation rate. The novel coronavirus (SARS-CoV-2), an RNA virus, has mutated to some extent since the outbreak of novel coronavirus pneumonia (COVID-19). Studying the evolution and variation of novel coronavirus genes is important for tracing the source of infection and understanding how the virus evolves. This research is based on the Novel Coronavirus 2019 database at the National Genomics Data Center. We combined macro-level and micro-level analyses. We used a phylogenetic tree to analyze gene fragments of the virus, constructed an evolutionary tree with a depth of 301, and searched for the root node of the tree to find the source of the virus within the data set. We then used spectral clustering to analyze the degree of novel coronavirus variation in each country and visualized the clustering results to make them easier to interpret. The experimental results show that, based on the existing data, the strain sample at the top of the evolutionary tree originated in New Zealand. In the evolutionary tree, the evolutionary process of the virus can be divided into three branches. After clustering the virus source data and constructing a visual map of the degree of variation of SARS-CoV-2, we found that the viruses in South Africa, New Zealand and other countries had a higher degree of variation, while the viruses in Australia, the United States and other countries had a relatively lower degree of variation.
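The two steps named in the abstract, locating the root of an evolutionary tree and grouping samples by spectral clustering, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the tree is given as a hypothetical child-to-parent map, depth is counted in edges, the similarity matrix `W` is synthetic, and the clustering is the simplest two-group variant (sign of the Fiedler vector of the unnormalized Laplacian), computed with the standard library only.

```python
from collections import deque


def find_root(parent):
    """Return the single node that never appears as a child."""
    nodes = set(parent) | set(parent.values())
    roots = [n for n in nodes if n not in parent]
    if len(roots) != 1:
        raise ValueError("expected exactly one root")
    return roots[0]


def tree_depth(parent):
    """Longest root-to-leaf path, counted in edges (breadth-first walk)."""
    children = {}
    for child, par in parent.items():
        children.setdefault(par, []).append(child)
    depth = 0
    queue = deque([(find_root(parent), 0)])
    while queue:
        node, d = queue.popleft()
        depth = max(depth, d)
        for c in children.get(node, []):
            queue.append((c, d + 1))
    return depth


def spectral_bipartition(W, iters=200):
    """Two-group spectral clustering on a symmetric similarity matrix W.

    Uses the unnormalized Laplacian L = D - W and approximates its Fiedler
    vector by power iteration on (shift*I - L), removing the constant
    eigenvector at every step. Assumes at least two items and a connected
    similarity graph.
    """
    n = len(W)
    deg = [sum(row) for row in W]
    shift = 2.0 * max(deg)  # upper bound on the eigenvalues of L
    v = [float(i) for i in range(n)]
    for _ in range(iters):
        mean = sum(v) / n
        v = [x - mean for x in v]  # project out the constant vector
        # (shift*I - L) v = (shift - deg_i) * v_i + sum_j W_ij * v_j
        v = [(shift - deg[i]) * v[i] + sum(W[i][j] * v[j] for j in range(n))
             for i in range(n)]
        norm = sum(x * x for x in v) ** 0.5
        v = [x / norm for x in v]
    return [1 if x > 0 else 0 for x in v]


# Hypothetical toy tree: sample id -> parent sample id.
parent = {"s1": "s0", "s2": "s0", "s3": "s1"}
root = find_root(parent)      # "s0"
depth = tree_depth(parent)    # 2

# Synthetic similarities between four placeholder groups A, B, C, D:
# A and B are similar to each other, as are C and D.
W = [
    [0.0, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 0.9],
    [0.1, 0.1, 0.9, 0.0],
]
labels = spectral_bipartition(W)  # A and B share one label, C and D the other
```

For more than two groups, or for real sequence data, a library routine such as scikit-learn's `SpectralClustering` would normally replace the hand-written power iteration; the version here only shows the idea on a toy input.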