Identification of novel missense mutations in a large number of recent SARS-CoV-2 genome sequences

H. Cai, Kimberly K. Cai, Julang Li
Journal of Biotechnology and Biomedicine. Published 2020-04-28.
DOI: 10.20944/preprints202004.0482.v1 (https://doi.org/10.20944/preprints202004.0482.v1)
Citations: 10

Abstract

Background. SARS-CoV-2 infection has spread to over 200 countries since it was first reported in December 2019. Significant country-specific variations in infection and mortality rates have been noted. Although country-specific differences in public health response have had a large impact on infection rate control, it is currently unclear whether evolution of the virus itself has also contributed to variations in infection and mortality rates. Previous studies of SARS-CoV-2 mutations were based on the analysis of ~160 SARS-CoV-2 sequences available until mid-February 2020 [2, 3, 4, 5]. By mid-April, >550 SARS-CoV-2 sequences had been deposited in GenBank, and over 8,200 in the GISAID database.

Methods. We performed a sequence analysis of 474 SARS-CoV-2 genomes submitted to GenBank up to April 11, 2020, by multiple alignment using Map to a Reference Assembly and Variants/SNP identification. The results were verified on a larger scale using 8,126 hCoV-19 (SARS-CoV-2) sequences from the GISAID database.

Results. We identified 5 frequent new mutations that have emerged since late February 2020, each present in many isolates (up to 40%). These are one missense (non-synonymous) mutation each in orf1ab (C1059T), orf3 (G25563T), and orf8 (C27964T), one mutation in the 5' UTR (C241T), and one in a non-coding region (G29553A). The three missense mutations lead to amino acid substitutions in the viral protein sequence. The last mutation (G29553A) was found to be almost exclusive to the US isolates. Except for C241T, all the novel mutations identified are absent from the Italian and Spanish isolates among the SARS-CoV-2 genomes deposited in GenBank and GISAID.

Conclusion. The results of the current study indicate that new mutations are emerging as the COVID-19 pandemic spreads to different countries and that geography-specific mutants exist. The findings lay the foundation for further investigation into the impact of SARS-CoV-2 mutations on disease incidence, severity, and host immune response. In addition, they may provide insights into vaccine development and serological response detection for the virus.
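
The variant-identification step described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' pipeline (the abstract names only "Map to a Reference Assembly" and "Variants/SNP identification"); it assumes the samples have already been aligned to the reference so that every sequence has the same length, and it ignores insertions and deletions. The function names `call_substitutions` and `mutation_frequencies` and the toy sequences are hypothetical. Substitutions are reported in the same reference-base, 1-based position, alternate-base form used above (for example C241T).

```python
from collections import Counter

VALID_BASES = set("ACGT")


def call_substitutions(reference, sample):
    """Return substitutions such as 'C241T' (1-based position) for one aligned sample.

    Assumption: `sample` is already aligned to `reference` (equal length, no indels).
    """
    if len(reference) != len(sample):
        raise ValueError("sample must be aligned to the reference (equal length)")
    calls = []
    pairs = zip(reference.upper(), sample.upper())
    for pos, (ref_base, alt_base) in enumerate(pairs, start=1):
        # Skip gaps and ambiguous bases (N, -, ...) instead of calling them variants.
        if ref_base not in VALID_BASES or alt_base not in VALID_BASES:
            continue
        if ref_base != alt_base:
            calls.append(f"{ref_base}{pos}{alt_base}")
    return calls


def mutation_frequencies(reference, samples):
    """Map each substitution to the fraction of samples that carry it."""
    counts = Counter()
    for sample in samples:
        counts.update(call_substitutions(reference, sample))
    total = len(samples)
    if total == 0:
        return {}
    return {mutation: n / total for mutation, n in counts.items()}


if __name__ == "__main__":
    # Toy data for illustration only; these are not real SARS-CoV-2 sequences.
    toy_reference = "ACGTACGT"
    toy_samples = ["ACGTACGT", "ATGTACGT", "ATGTACGA", "ACNTACGT"]
    print(sorted(mutation_frequencies(toy_reference, toy_samples).items()))
    # prints [('C2T', 0.5), ('T8A', 0.25)]
```

With real data, a frequency table of this kind is what allows a statement such as "present in up to 40% of isolates". Splitting the samples by country of origin before calling `mutation_frequencies` would give the per-country comparison reported for the US, Italian, and Spanish isolates.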