Sitetack: a deep learning model that improves PTM prediction by using known PTMs.

Clair S Gutierrez, Alia A Kassim, Benjamin D Gutierrez, Ronald T Raines
Bioinformatics (Oxford, England). Published 2024-11-01. DOI: 10.1093/bioinformatics/btae602 (https://doi.org/10.1093/bioinformatics/btae602). PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11552626/pdf/

Abstract

Motivation: Post-translational modifications (PTMs) increase the diversity of the proteome and are vital to organismal life and therapeutic strategies. Deep learning has been used to predict PTM locations. Still, limitations in datasets and their analyses compromise success.

Results: We evaluated the use of known PTM sites in prediction via sequence-based deep learning algorithms. For each PTM, known locations of that PTM were encoded as a separate amino acid before sequences were encoded via word embedding and passed into a convolutional neural network that predicts the probability of that PTM at a given site. Without labeling known PTMs, our models are on par with others. With labeling, however, we improved significantly upon extant models. Moreover, knowing PTM locations can increase the predictability of a different PTM. Our findings highlight the importance of PTMs for the installation of additional PTMs. We anticipate that including known PTM locations will enhance the performance of other proteomic machine learning algorithms.
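The labeling step described in the Results can be illustrated with a minimal Python sketch. It is not the authors' implementation: the symbols `@` (known PTM) and `-` (padding), the window size, the integer vocabulary, and the choice to hide the label of the queried site itself are all assumptions made here for illustration. The sketch marks known PTM positions with an extra token, cuts a fixed-length window around a candidate site, and maps the window to integer indices of the kind an embedding layer in front of a convolutional network could consume.

```python
from typing import Iterable, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PTM_TOKEN = "@"  # hypothetical extra "amino acid" marking a known PTM site
PAD_TOKEN = "-"  # hypothetical padding symbol for windows near the termini

# Integer vocabulary for an embedding lookup; index 0 is the padding token.
VOCAB = {tok: i for i, tok in enumerate(PAD_TOKEN + AMINO_ACIDS + PTM_TOKEN)}


def label_known_ptms(sequence: str, known_sites: Iterable[int]) -> str:
    """Replace residues at known PTM positions (0-based) with PTM_TOKEN."""
    chars = list(sequence)
    for pos in known_sites:
        if not 0 <= pos < len(chars):
            raise IndexError(f"PTM position {pos} is outside the sequence")
        chars[pos] = PTM_TOKEN
    return "".join(chars)


def extract_window(sequence: str, known_sites: Iterable[int],
                   site: int, flank: int) -> str:
    """Return the 2*flank+1 tokens centred on `site`.

    Assumption: the queried site keeps its original residue so that the
    window does not reveal the answer being predicted.
    """
    context_sites = [p for p in known_sites if p != site]
    labeled = label_known_ptms(sequence, context_sites)
    padded = PAD_TOKEN * flank + labeled + PAD_TOKEN * flank
    return padded[site:site + 2 * flank + 1]


def encode(window: str) -> List[int]:
    """Map each token in the window to its integer index in VOCAB."""
    return [VOCAB[tok] for tok in window]


if __name__ == "__main__":
    seq = "MKSPTSAK"   # toy sequence, not real data
    known = [2, 5]     # toy known PTM positions (0-based)
    win = extract_window(seq, known, site=5, flank=3)
    print(win, encode(win))
```

Running the unlabeled baseline corresponds to passing an empty list of known sites, which leaves the window as plain amino acids and padding.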

Availability and implementation: Sitetack is available as a web tool at https://sitetack.net; the source code, representative datasets, instructions for local use, and select models are available at https://github.com/clair-gutierrez/sitetack.

Supplementary information: Supplementary data are available at Bioinformatics online.