Features that matter: Evolutionary signatures can predict viral transmission routes.

IF 5.5 · Zone 1 (Medicine) · Q1 MICROBIOLOGY
PLoS Pathogens Pub Date : 2024-10-21 eCollection Date: 2024-10-01 DOI:10.1371/journal.ppat.1012629
Maya Wardeh, Jack Pilgrim, Melody Hui, Aurelia Kotsiri, Matthew Baylis, Marcus S C Blagrove

Abstract

Routes of virus transmission between hosts are key to understanding viral epidemiology. Different routes have large effects on viral ecology and on the likelihood and rate of transmission; for example, respiratory and vector-borne viruses together account for the majority of rapid outbreaks and high-consequence animal and plant epidemics. However, determining the specific transmission route(s) can take months to years, delaying mitigation efforts. Here, we identify the viral features and evolutionary signatures that are predictive of viral transmission routes and use them to rapidly predict, in silico, potential routes for fully sequenced viruses, both those with no observed routes and those with missing routes. This was achieved by compiling a dataset of 24,953 virus-host associations with 81 defined transmission routes, constructing a hierarchy of virus transmission encompassing those routes and 42 higher-order modes, and engineering 446 predictive features from three complementary perspectives. We integrated those data and features to train 98 independent ensembles of LightGBM classifiers. We found that all features contributed to the prediction for at least one of the routes and/or modes of transmission, demonstrating the utility of our broad multi-perspective approach. Our framework achieved ROC-AUC = 0.991 and F1-score = 0.855 across all included transmission routes and modes, and achieved high predictive performance for high-consequence respiratory (ROC-AUC = 0.990, F1-score = 0.864) and vector-borne transmission (ROC-AUC = 0.997, F1-score = 0.921). Our framework ranks the viral features in order of their contribution to prediction, per transmission route, and hence identifies the genomic evolutionary signatures associated with each route. Together with the more mature field of viral host-range prediction, our predictive framework could: provide early insights into the potential for, and pattern of, viral spread; facilitate rapid response with appropriate measures; and significantly triage the time-consuming investigations needed to confirm the likely routes of transmission.
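
The abstract mentions a hierarchy linking specific routes to higher-order modes, and per-label evaluation with ROC-AUC and F1-score. The following is a minimal sketch of those two ideas only, not the authors' pipeline: the route and mode names, the `PARENT` mapping, the virus names, and the scores are all hypothetical, and hand-written toy scores stand in for the output of a trained classifier such as LightGBM.

```python
from typing import Dict, List, Set

# Hypothetical fragment of a route -> parent-mode hierarchy. These names are
# illustrative assumptions, not the paper's 81 routes and 42 modes.
PARENT: Dict[str, str] = {
    "mosquito-borne": "vector-borne",
    "tick-borne": "vector-borne",
    "airborne droplets": "respiratory",
}


def expand_labels(routes: Set[str]) -> Set[str]:
    """Return the observed routes plus every ancestor mode they imply."""
    labels = set(routes)
    for route in routes:
        node = route
        while node in PARENT:  # walk up until a node has no parent
            node = PARENT[node]
            labels.add(node)
    return labels


def roc_auc(y_true: List[int], scores: List[float]) -> float:
    """Fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_score(y_true: List[int], scores: List[float], threshold: float = 0.5) -> float:
    """F1 = 2*TP / (2*TP + FP + FN) after thresholding the scores."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(y_true, pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, pred) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data: observed routes per (hypothetical) virus.
observed = {
    "virus A": {"mosquito-borne"},
    "virus B": {"airborne droplets"},
    "virus C": {"tick-borne"},
    "virus D": {"airborne droplets"},
}

# One binary target per label; here the higher-order mode "vector-borne".
label = "vector-borne"
y_true = [1 if label in expand_labels(r) else 0 for r in observed.values()]

# Made-up scores standing in for a classifier's predicted probabilities.
toy_scores = [0.9, 0.2, 0.4, 0.6]

auc = roc_auc(y_true, toy_scores)
f1 = f1_score(y_true, toy_scores)
print(label, "ROC-AUC:", auc, "F1:", f1)
```

Treating each route or mode as its own binary target is one simple way to read "independent ensembles"; how the paper actually builds targets, handles class imbalance, or chooses thresholds is not stated in the abstract.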

Source journal
PLoS Pathogens (MICROBIOLOGY-PARASITOLOGY)
Self-citation rate: 3.00%
Articles published: 598
Journal description: Bacteria, fungi, parasites, prions and viruses cause a plethora of diseases that have important medical, agricultural, and economic consequences. Moreover, the study of microbes continues to provide novel insights into such fundamental processes as the molecular basis of cellular and organismal function.