Using multiple convolutional window scanning of convolutional neural network for an efficient prediction of ATP‐binding sites in transport proteins

Proteins: Structure | Pub Date: 2022-03-04 | DOI: 10.1002/prot.26329

Trinh-trung-duong Nguyen, Syun Chen, Quang-Thai Ho, Yu-Yen Ou

Pages: 1486-1492

Citations: 2

Abstract

Information from protein multiple sequence alignments has long been an important feature for inferring the function of a protein from related sequences with known functions. It is therefore one of the underlying ideas of AlphaFold 2, a breakthrough model for predicting the three-dimensional structures of proteins from their primary sequences. Our study used multiple sequence alignment information in the form of position-specific scoring matrices as input. We also refined the use of the convolutional neural network, a well-known deep-learning architecture with impressive results on image and image-like data. Specifically, we revisited the prediction of adenosine triphosphate (ATP)-binding sites with more efficient convolutional neural networks. We applied convolutional filters with multiple scanning window sizes to the position-specific scoring matrices to capture as much useful information as possible. Furthermore, a one-max pooling layer retains only the most specific motif from each feature map before the result is passed to the next layer. We assumed that this would help retain the most conserved motifs, which carry discriminative information for prediction. Our experimental results show that a convolutional neural network with relatively few convolutional layers can be enough to extract the conserved information of proteins, which leads to higher performance. Our best prediction models were obtained after examining different hyper-parameters. Our experimental results showed that our models were superior to the traditional use of convolutional neural networks on the same datasets, as well as to other machine-learning classification algorithms.
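The idea of scanning a position-specific scoring matrix with several window sizes and keeping only the strongest response per filter can be sketched in plain Python. This is an illustrative sketch, not the authors' implementation: the window sizes (3, 5, 7), the four filters per size, the 17-residue fragment, the ReLU activation, and the random weights are all assumptions made for the example, and the function names are hypothetical.

```python
import random

def conv_scan(pssm, kernel):
    """Slide one filter (k rows x 20 columns) along the residue axis of the PSSM.

    Returns the feature map: one ReLU-activated response per window position.
    """
    k = len(kernel)
    feature_map = []
    for start in range(len(pssm) - k + 1):
        total = 0.0
        for i in range(k):
            row, weights = pssm[start + i], kernel[i]
            total += sum(x * w for x, w in zip(row, weights))
        feature_map.append(max(0.0, total))  # ReLU (assumed activation)
    return feature_map

def multi_window_features(pssm, filters_by_size):
    """Apply filters of several window sizes; one-max pooling keeps a single
    value (the maximum) from each feature map."""
    features = []
    for size in sorted(filters_by_size):
        for kernel in filters_by_size[size]:
            features.append(max(conv_scan(pssm, kernel)))
    return features

def random_filters(sizes, per_size, n_cols=20, seed=0):
    """Build untrained filters with random weights (stand-ins for learned ones)."""
    rng = random.Random(seed)
    return {
        s: [[[rng.uniform(-1, 1) for _ in range(n_cols)] for _ in range(s)]
            for _ in range(per_size)]
        for s in sizes
    }

# Toy PSSM fragment: 17 residues x 20 amino-acid columns (random values, assumed shape).
rng = random.Random(1)
pssm = [[rng.uniform(-5, 5) for _ in range(20)] for _ in range(17)]

filters = random_filters(sizes=(3, 5, 7), per_size=4)
features = multi_window_features(pssm, filters)
print(len(features))  # 3 window sizes x 4 filters = 12 pooled features
```

In a trained model the pooled feature vector would feed a classifier layer, and the filter weights would be learned rather than random. One-max pooling makes the output length depend only on the number of filters, not on the fragment length or window size, which is what lets filters of different sizes be concatenated directly.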
