Using multiple convolutional window scanning of convolutional neural network for an efficient prediction of ATP‐binding sites in transport proteins

Trinh-trung-duong Nguyen, Syun Chen, Quang-Thai Ho, Yu-Yen Ou
Journal: Proteins: Structure
Publication date: 2022-03-04
Publication type: Journal Article
DOI: 10.1002/prot.26329 (https://doi.org/10.1002/prot.26329)
Citations: 2

Abstract

Information from protein multiple sequence alignments has long been an important source of features for inferring the functions of proteins from related sequences with known functions. It is also one of the ideas underlying AlphaFold 2, a breakthrough model for predicting the three‐dimensional structures of proteins from their primary sequences. Our study used multiple sequence alignment information in the form of position‐specific scoring matrices as input. We also refined the use of the convolutional neural network, a well‐known deep‐learning architecture with impressive results on image and image‐like data. Specifically, we revisited the prediction of adenosine triphosphate (ATP)‐binding sites with more efficient convolutional neural networks. We applied multiple convolutional window‐scanning filters of a convolutional neural network to the position‐specific scoring matrices to capture as much useful information as possible. Furthermore, a one‐max pooling layer retains only the most specific motif from each feature map before the output passes to the next layer. We assumed that this would help retain the most conserved motifs, which carry discriminative information for prediction. Our experimental results show that a convolutional neural network with relatively few convolutional layers can be enough to extract the conserved information of proteins, which leads to higher performance. Our best prediction models were obtained after examining different hyper‐parameters. On the same datasets, our models outperformed both the traditional use of convolutional neural networks and other machine‐learning classification algorithms.