Examining the Influenza A Virus Sialic Acid Binding Preference Predictions of a Sequence-Based Convolutional Neural Network

Impact factor 4.3; Zone 4 (Medicine); JCR Q1, Infectious Diseases
Laura K. Borkenhagen, Jonathan A. Runstadler
Journal: Influenza and Other Respiratory Viruses, volume 18, issue 12
Published: 2024-12-11
Publication type: Journal Article
DOI: 10.1111/irv.70044
Publisher page: https://onlinelibrary.wiley.com/doi/10.1111/irv.70044
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11634464/pdf/
Citations: 0

Abstract


Examining the Influenza A Virus Sialic Acid Binding Preference Predictions of a Sequence-Based Convolutional Neural Network


Background

Though receptor binding specificity is well established as a contributor to host tropism and spillover potential of influenza A viruses, determining receptor binding preference of a specific virus still requires expensive and time-consuming laboratory analyses. In this study, we pilot a machine learning approach for prediction of binding preference.

Methods

We trained a convolutional neural network to predict the α2,6-linked sialic acid preference of influenza A viruses given the hemagglutinin amino acid sequence. The model was evaluated with an independent test dataset to assess the standard performance metrics, the impact of missing data in the test sequences, and the prediction performance on novel subtypes. Further, features found to be important to the generation of predictions were tested via targeted mutagenesis of H9 and H16 proteins expressed on pseudoviruses.
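As a rough illustration of how a hemagglutinin amino acid sequence can be fed to a convolutional layer, the sketch below one-hot encodes a sequence and slides a single filter along it. This is not the architecture used in the study: the 20-letter alphabet, the all-zero treatment of unknown residues, the filter width, the toy sequence, and the function names `one_hot_encode`, `conv1d_max`, and `motif_kernel` are all assumptions made for the example.

```python
# Illustrative sketch only: alphabet, filter, and sequence are assumptions,
# not the architecture or parameters used in the study.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence):
    """Map each residue to a 20-dim indicator vector.

    Residues outside the alphabet (e.g. 'X' for missing data) become all zeros.
    """
    encoded = []
    for residue in sequence.upper():
        row = [0.0] * len(AMINO_ACIDS)
        if residue in INDEX:
            row[INDEX[residue]] = 1.0
        encoded.append(row)
    return encoded


def conv1d_max(encoded, kernel):
    """Slide one filter (width x 20 weights) along the sequence.

    Returns the largest activation after a ReLU, i.e. a global max pool.
    """
    width = len(kernel)
    best = 0.0  # ReLU floor
    for start in range(len(encoded) - width + 1):
        activation = 0.0
        for offset in range(width):
            window_row = encoded[start + offset]
            activation += sum(w * x for w, x in zip(kernel[offset], window_row))
        best = max(best, activation)
    return best


def motif_kernel(motif):
    """Hypothetical hand-built filter that responds to one exact residue motif."""
    return one_hot_encode(motif)


# Toy usage: an arbitrary sequence and motif with no biological meaning implied.
toy_sequence = "MKTIIALSYQSGT"
score = conv1d_max(one_hot_encode(toy_sequence), motif_kernel("ALS"))
```

In a trained network the filter weights are learned rather than hand-built, and many filters feed later layers that produce the binding preference prediction.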

Results

The final model classified 94% of the test dataset correctly and achieved an area under the receiver operating characteristic curve of 0.93. The model tolerated about 10% missing test data without compromising prediction accuracy. Predictions on novel subtypes revealed that the model can extrapolate feature relationships between subtypes when generating binding predictions. Finally, evaluation of the features important for model predictions helped identify positions that alter the sialic acid conformation preference of hemagglutinin proteins in practice.
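For readers unfamiliar with the reported metrics, the sketch below shows one generic way to compute accuracy and the area under the ROC curve, and one way to simulate missing residues in a test sequence. The abstract does not describe how the authors computed these quantities or how missing data were introduced, so the 0.5 threshold, the `X` placeholder, the function names, and the toy labels and scores are assumptions for illustration only.

```python
import random


def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples whose thresholded score matches the binary label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a random positive example scores higher
    than a random negative example, with ties counted as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def mask_residues(sequence, fraction, rng, placeholder="X"):
    """Replace a fraction of positions with a placeholder to mimic missing data."""
    n_masked = round(len(sequence) * fraction)
    positions = set(rng.sample(range(len(sequence)), n_masked))
    return "".join(placeholder if i in positions else aa
                   for i, aa in enumerate(sequence))


# Toy values, not data from the study.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.3, 0.6, 0.4, 0.1]
toy_accuracy = accuracy(toy_labels, toy_scores)
toy_auc = roc_auc(toy_labels, toy_scores)
masked = mask_residues("MKTIIALSYQSGTMKTIIAL", 0.10, random.Random(0))
```

Re-scoring masked copies of the test sequences at increasing fractions is one simple way to see where performance begins to degrade.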

Conclusions

Ultimately, our results support this in silico approach to predicting hemagglutinin receptor binding preference. This work emphasizes the need for ongoing research to produce tools that may aid future pandemic risk assessment.

Source journal
CiteScore: 7.20
Self-citation rate: 4.50%
Articles published: 120
Review time: 6-12 weeks
About the journal: Influenza and Other Respiratory Viruses is the official journal of the International Society of Influenza and Other Respiratory Virus Diseases, an independent scientific professional society dedicated to promoting the prevention, detection, treatment, and control of influenza and other respiratory virus diseases. Influenza and Other Respiratory Viruses is an Open Access journal. Copyright on any research article published by Influenza and Other Respiratory Viruses is retained by the author(s). Authors grant Wiley a license to publish the article and identify itself as the original publisher. Authors also grant any third party the right to use the article freely as long as its integrity is maintained and its original authors, citation details and publisher are identified.