DeepSEA: an alignment-free explainable approach to annotate antimicrobial resistance proteins.

IF 3.3 | Zone 3, Biology | JCR Q2, Biochemical Research Methods

BMC Bioinformatics. Pub Date: 2025-09-01. DOI: 10.1186/s12859-025-06256-4

Tiago Cabral Borelli, Alexandre Rossi Paschoal, Ricardo Roberto da Silva

BMC Bioinformatics, volume 26 (1), 224. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12403478/pdf/

Citations: 0

Abstract


Antimicrobial resistance (AMR) is one of the most concerning modern threats, as it places a greater burden on health systems than HIV and malaria combined. Current surveillance strategies for tracking AMR rely on genomic comparisons and depend on sequence alignment with strict similarity cutoffs of greater than 95%. As a result, these methods have high false-negative rates, because the available reference sequences do not cover the diversity of AMR proteins representatively. Deep learning has been used as an alternative to sequence alignment, as artificial neural networks can extract abstract features from data, thereby limiting the need for sequence comparisons. Here, a convolutional neural network (CNN) was trained to differentiate between antimicrobial resistance proteins and non-resistance proteins, and to annotate them into nine resistance classes. Our model achieved higher recall (> 0.9) than the alignment-based approach for all protein classes tested. Additionally, our CNN architecture allowed us to investigate internal states and to explain the model's classifications in terms of the importance of protein domain features related to antimicrobial molecule inactivation. Finally, we built an open-source bioinformatic tool ( https://github.com/computational-chemical-biology/DeepSEA-project ) that can be used to annotate antimicrobial resistance proteins and provide information on protein domains without sequence alignment.
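To make two ideas from the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a one-hot encoding as one common way to hand a protein sequence to a CNN without aligning it to a reference (the abstract does not state which encoding DeepSEA uses), and it computes per-class recall, the metric the abstract reports. The function names, the `max_len` parameter, and the class labels in the usage example are all hypothetical.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence, max_len):
    """Encode a protein sequence as a max_len x 20 matrix of 0/1 values.

    Assumption for illustration: sequences longer than max_len are
    truncated, shorter ones are padded with all-zero rows, and
    non-standard residues (e.g. X) also become all-zero rows.
    """
    matrix = [[0] * len(AMINO_ACIDS) for _ in range(max_len)]
    for pos, residue in enumerate(sequence.upper()[:max_len]):
        idx = AA_INDEX.get(residue)
        if idx is not None:
            matrix[pos][idx] = 1
    return matrix


def per_class_recall(true_labels, predicted_labels):
    """Return {class: TP / (TP + FN)} for every class in true_labels.

    A false negative (a true member of a class predicted as something
    else) lowers that class's recall.
    """
    totals = Counter(true_labels)
    hits = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


if __name__ == "__main__":
    encoded = one_hot_encode("MKT", max_len=5)
    print(len(encoded), len(encoded[0]))  # 5 20

    # Hypothetical labels; the abstract does not name the nine classes.
    y_true = ["beta-lactam", "beta-lactam", "non-AMR", "tetracycline"]
    y_pred = ["beta-lactam", "non-AMR", "non-AMR", "tetracycline"]
    print(per_class_recall(y_true, y_pred))
```

In the usage example, one of the two hypothetical "beta-lactam" proteins is missed, so recall for that class is 0.5 while the other classes stay at 1.0. This is the kind of false-negative error the abstract attributes to strict alignment cutoffs.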


Source journal

BMC Bioinformatics (Biology: Biochemical Research Methods)

CiteScore: 5.70
Self-citation rate: 3.30%
Articles published: 506
Review time: 4.3 months

About the journal: BMC Bioinformatics is an open access, peer-reviewed journal that considers articles on all aspects of the development, testing and novel application of computational and statistical methods for the modeling and analysis of all kinds of biological data, as well as other areas of computational biology. BMC Bioinformatics is part of the BMC series which publishes subject-specific journals focused on the needs of individual research communities across all areas of biology and medicine. We offer an efficient, fair and friendly peer review service, and are committed to publishing all sound science, provided that there is some advance in knowledge presented by the work.