Development of avian influenza A(H5) virus datasets for Nextclade enables rapid and accurate clade assignment.

Jordan T Ort, Sonja A Zolnoski, Tommy T-Y Lam, Richard Neher, Louise H Moncla
bioRxiv : the preprint server for biology. Published 2025-02-03. doi: 10.1101/2025.01.07.631789 (https://doi.org/10.1101/2025.01.07.631789)
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11741357/pdf/

Abstract


The ongoing panzootic of highly pathogenic avian influenza (HPAI) A(H5) viruses is the largest in history, with unprecedented transmission to multiple mammalian species. Avian influenza A viruses of the H5 subtype circulate globally among birds and are classified into distinct clades based on their hemagglutinin (HA) genetic sequences. Thus, the ability to accurately and rapidly assign clades to newly sequenced isolates is key to surveillance and outbreak response. Co-circulation of endemic, low pathogenic avian influenza (LPAI) A(H5) lineages in North American and European wild birds necessitates the ability to rapidly and accurately distinguish between infections arising from these lineages and epizootic HPAI A(H5) viruses. However, currently available clade assignment tools are limited and often require command line expertise, hindering their utility for public health surveillance labs. To address this gap, we have developed datasets to enable A(H5) clade assignments with Nextclade, a drag-and-drop tool originally developed for SARS-CoV-2 genetic clade classification. Using annotated reference datasets for all historical A(H5) clades, clade 2.3.2.1 descendants, and clade 2.3.4.4 descendants provided by the Food and Agriculture Organization/World Health Organization/World Organisation for Animal Health (FAO/WHO/WOAH) H5 Working Group, we identified clade-defining mutations for every established clade to enable tree-based clade assignment. We then created three Nextclade datasets which can be used to assign clades to A(H5) HA sequences and call mutations relative to reference strains through a drag-and-drop interface. Nextclade assignments were benchmarked with 19,834 unique sequences not in the reference set using a pre-released version of LABEL, a well-validated and widely used command line software. 
Prospective assignment of new sequences with Nextclade and LABEL produced very well-matched assignments (match rates of 97.8% and 99.1% for the 2.3.2.1 and 2.3.4.4 datasets, respectively). The all-clades dataset also performed well (94.8% match rate) and correctly distinguished between all HPAI and LPAI strains. This tool additionally allows for the identification of polybasic cleavage site sequences and potential N-linked glycosylation sites. These datasets therefore provide an alternative, rapid method to accurately assign clades to new A(H5) HA sequences, with the benefit of an easy-to-use browser interface.
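
The abstract mentions identifying potential N-linked glycosylation sites. As a rough illustration only, the sketch below scans a protein sequence for the standard N-X-S/T sequon (asparagine, then any residue except proline, then serine or threonine). It is not Nextclade's implementation, which the abstract does not describe; the function name `find_sequons` and the toy sequence are hypothetical, and the sequence is not a real HA protein.

```python
import re

# Standard N-linked glycosylation sequon: N, any amino acid except proline, then S or T.
# The lookahead lets overlapping motifs (e.g. "NNTS") each be reported.
# The middle class lists the 19 non-proline amino acids, so gaps and ambiguity
# codes such as "-" or "X" never count as a match.
SEQUON = re.compile(r"(?=N[ACDEFGHIKLMNQRSTVWY][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of the asparagine in each N-X-[S/T] motif.

    Hypothetical helper for illustration; assumes an ungapped one-letter
    amino acid string.
    """
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    toy = "MNGSANPTNNTS"  # made-up sequence, not a real HA protein
    print(find_sequons(toy))
```

A sequon marks only a potential site; whether a given asparagine is actually glycosylated depends on structural context that a motif scan cannot capture.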
