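Analyzing and exploring Graph Attention Networks and protein-based language models for predicting Porphyromonas gingivalis resistant efflux protein sequences.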

Impact Factor: 2.7 | JCR Q1 | Dentistry, Oral Surgery & Medicine

Dental and Medical Problems | Pub Date: 2025-03-01 | DOI: 10.17219/dmp/186143

Pradeep Kumar Yadalam, Prabhu Manickam Natarajan, Naresh Shetty, Maria Maddalena Marrapodi, Hande Uzunçıbuk, Diana Russo, Marco Cicciù, Giuseppe Minervini

Volume 62, Issue 2, Pages 265-273. Publication type: Journal Article.

Citations: 0

Abstract



Background: Antimicrobial resistance (AMR) must be predicted to combat antibiotic-resistant illnesses. Based on high-priority AMR genomes, it is possible to track resistance and focus treatment to stop global outbreaks. Large language models (LLMs) are essential for identifying Porphyromonas gingivalis multiresistant efflux genes to prevent resistance. Antibiotic resistance is a serious problem; however, by studying specific bacterial genomes, we can predict how resistance develops and find better kinds of treatment.

Objectives: This paper explores using advanced models to predict the sequences of proteins that make P. gingivalis resistant to treatment. Understanding this approach could help prevent AMR more effectively.

Material and methods: This research utilized multi-drug-resistant efflux protein sequences from P. gingivalis, identified through UniProt ID A0A0K2J2N6_PORGN, and formatted as FASTA sequences for analysis. These sequences underwent rigorous detection and quality assurance processes to ensure their suitability for computational analysis. The study employed the DeepBIO framework, which integrates LLMs with deep attention networks to process FASTA sequences.
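The FASTA preparation step described above can be sketched as follows. This is a minimal illustration and not the DeepBIO pipeline or the authors' code: the function names `parse_fasta` and `is_clean`, the choice of the 20 standard amino-acid letters as the quality filter, and the toy records in `TOY_FASTA` are all assumptions made for the example. The toy sequences are invented and are not the real UniProt entry.

```python
from typing import Dict, Iterable

# Assumption: a "clean" sequence uses only the 20 standard amino-acid letters.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def parse_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA text into {record_id: sequence}, joining wrapped lines."""
    records: Dict[str, str] = {}
    current_id = None
    chunks = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if current_id is not None:
                records[current_id] = "".join(chunks)
            current_id = line[1:].split()[0]  # ID is the first token after '>'
            chunks = []
        elif current_id is not None:
            chunks.append(line.upper())
    if current_id is not None:
        records[current_id] = "".join(chunks)
    return records


def is_clean(sequence: str) -> bool:
    """Return True if the sequence is non-empty and has only standard residues."""
    return len(sequence) > 0 and set(sequence) <= STANDARD_AMINO_ACIDS


# Hypothetical toy input, NOT real P. gingivalis data.
TOY_FASTA = """>toy_efflux_1 hypothetical example
MKTLLIAG
VLAGX
>toy_efflux_2
MSTNPKPQR
"""

if __name__ == "__main__":
    for record_id, seq in parse_fasta(TOY_FASTA.splitlines()).items():
        status = "ok" if is_clean(seq) else "rejected (non-standard residue)"
        print(record_id, len(seq), status)
```

Filtering on the standard alphabet is only one possible quality check; the abstract does not say which checks the study applied.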

Results: The analysis revealed that the Long Short-Term Memory (LSTM)-attention, ProtBERT and BERTGAT models achieved sensitivity scores of 0.9 across the board, with accuracy rates of 89.5%, 88.5% and 90.5%, respectively. These results highlight the effectiveness of the models in identifying P. gingivalis strains resistant to multiple drugs. Furthermore, the study assessed the specificity of the LSTM-attention, ProtBERT and BERTGAT models, which achieved scores of 0.89, 0.87 and 0.90, respectively. Specificity, or the true negative rate, measures the ability of a model to accurately identify non-resistant cases, which is crucial for minimizing false positives in AMR detection.
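For readers less familiar with these metrics, the sketch below shows how sensitivity, specificity and accuracy are computed from a binary confusion matrix. The counts in `toy` are hypothetical: the abstract does not report a confusion matrix, so they were chosen only so that the formulas return values of the same size as those quoted for the LSTM-attention model. The class and function names are also assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (resistant = positive class)."""
    tp: int  # resistant, predicted resistant
    fn: int  # resistant, predicted non-resistant
    tn: int  # non-resistant, predicted non-resistant
    fp: int  # non-resistant, predicted resistant


def sensitivity(c: ConfusionCounts) -> float:
    """True positive rate: TP / (TP + FN)."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """True negative rate: TN / (TN + FP)."""
    return c.tn / (c.tn + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Fraction of all predictions that are correct."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.tn + c.fp)


# Hypothetical counts, NOT taken from the paper.
toy = ConfusionCounts(tp=90, fn=10, tn=89, fp=11)

if __name__ == "__main__":
    print(f"sensitivity = {sensitivity(toy):.3f}")
    print(f"specificity = {specificity(toy):.3f}")
    print(f"accuracy    = {accuracy(toy):.3f}")
```

With equal numbers of positive and negative cases, as in this toy setup, accuracy is the mean of sensitivity and specificity. Whether the study's test set was balanced is not stated in the abstract.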

Conclusions: When utilized clinically, this LLM approach will help prevent AMR, which is a global problem. Understanding this approach may enable researchers to develop more effective treatment strategies that target specific resistant genes, reducing the likelihood of resistance development. Ultimately, this approach could play a pivotal role in preventing AMR on a global scale.


Source journal: Dental and Medical Problems

CiteScore: 4.00

Self-citation rate: 3.80%

Review time: 53 weeks