Mutation prediction in the SARS-CoV-2 genome using attention-based neural machine translation.

IF 2.6 · Zone 4 (Engineering Technology) · JCR Q1 (Mathematics)
Darrak Moin Quddusi, Sandesh Athni Hiremath, Naim Bajcinca
Mathematical Biosciences and Engineering, vol. 21, no. 5, pp. 5996-6018. Journal article, published 2024-05-20.
DOI: 10.3934/mbe.2024264 (https://doi.org/10.3934/mbe.2024264)

Abstract

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has continued to evolve rapidly since causing havoc worldwide in 2020. Because the virus mutates frequently, it has been very hard to contain. Changes in its genome drive viral evolution, rendering it more resistant to existing vaccines and drugs. Predicting viral mutations in advance will help in preparing for more infectious and virulent versions of the virus, in turn decreasing the damage they cause. In this paper, we propose several neural machine translation (NMT) architectures based on recurrent neural networks (RNNs) to predict mutations in selected SARS-CoV-2 non-structural proteins (NSPs), namely NSP1, NSP3, NSP5, NSP8, NSP9, NSP13, and NSP15. First, we created and pre-processed pairs of sequences from two languages, using k-means clustering and nearest neighbors, to train a neural machine translation model. We also provide insights into training NMT models on long biological sequences. In addition, we evaluated and benchmarked our models to demonstrate their efficiency and reliability.
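
The abstract names k-means clustering and nearest neighbors as the tools used to build sequence pairs but does not give the details. The sketch below shows one plausible way such a pairing step could look, and it is not the authors' pipeline. Everything in it is an assumption made for illustration: the k-mer frequency embedding, the split into hypothetical `sources` and `targets` sets, the rule of pairing each source with its nearest target inside the same cluster, and all function names.

```python
import random
from collections import Counter
from itertools import product

AMINO = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def kmer_vector(seq, k=2):
    """Embed a protein sequence as normalized k-mer frequencies (assumed encoding)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = max(sum(counts.values()), 1)
    # k-mers containing non-standard residues count toward the total only.
    return [counts["".join(p)] / total for p in product(AMINO, repeat=k)]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, n_clusters, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, n_clusters)  # needs n_clusters <= len(points)
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [
            min(range(n_clusters), key=lambda c: sq_dist(p, centers[c]))
            for p in points
        ]
        for c in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous center
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def make_pairs(sources, targets, n_clusters=2, k=2):
    """Pair each source with its nearest target from the same cluster."""
    seqs = sources + targets
    vecs = [kmer_vector(s, k) for s in seqs]
    labels = kmeans(vecs, n_clusters)
    n_src = len(sources)
    pairs = []
    for i in range(n_src):
        candidates = [
            j for j in range(n_src, len(seqs)) if labels[j] == labels[i]
        ]
        if not candidates:
            continue  # no target landed in this source's cluster
        nearest = min(candidates, key=lambda j: sq_dist(vecs[i], vecs[j]))
        pairs.append((seqs[i], seqs[nearest]))
    return pairs


if __name__ == "__main__":
    # Toy sequences, not real NSP data.
    demo_sources = ["MKVLAAGG", "WWYYFFHH"]
    demo_targets = ["MKVLAAGA", "WWYYFFHQ"]
    for src, tgt in make_pairs(demo_sources, demo_targets, n_clusters=1):
        print(src, "->", tgt)
```

Each resulting (source, target) pair would then play the role of a sentence pair in a translation corpus. Restricting the nearest-neighbor search to one cluster is a design choice of this sketch: it keeps pairs between broadly similar sequences, at the cost of dropping sources whose cluster contains no target.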

Source journal
Mathematical Biosciences and Engineering
Category: Engineering Technology, Mathematics (Interdisciplinary Applications)
CiteScore: 3.90
Self-citation rate: 7.70%
Articles published: 586
Review time: >12 weeks
About the journal: Mathematical Biosciences and Engineering (MBE) is an interdisciplinary Open Access journal promoting cutting-edge research, technology transfer and knowledge translation about complex data and information processing. MBE publishes Research articles (long and original research); Communications (short and novel research); Expository papers; Technology Transfer and Knowledge Translation reports (descriptions of new technologies and products); Announcements and Industrial Progress and News (announcements and even advertisements, including major conferences).