Bo Wen, Chenwei Wang, Kai Li, Ping Han, Matthew V. Holt, Sara R. Savage, Jonathan T. Lei, Yongchao Dou, Zhiao Shi, Yi Li, Bing Zhang
Nature Methods, volume 22, issue 9, pages 1857-1867. Published 2025-08-26. DOI: 10.1038/s41592-025-02797-x. https://www.nature.com/articles/s41592-025-02797-x
DeepMVP: deep learning models trained on high-quality data accurately predict PTM sites and variant-induced alterations
Post-translational modifications (PTMs) are critical regulators of protein function, and their disruption is a key mechanism by which missense variants contribute to disease. Accurate PTM site prediction using deep learning can help identify PTM-altering variants, but progress has been limited by the lack of large, high-quality training datasets. Here, we introduce PTMAtlas, a curated compendium of 397,524 PTM sites generated through systematic reprocessing of 241 public mass-spectrometry datasets, and DeepMVP, a deep learning framework trained on PTMAtlas to predict PTM sites for phosphorylation, acetylation, methylation, sumoylation, ubiquitination and N-glycosylation. DeepMVP substantially outperforms existing tools across all six PTM types. Its application to predicting PTM-altering missense variants shows strong concordance with experimental results, validated using literature-curated variants and cancer proteogenomic datasets. Together, PTMAtlas and DeepMVP provide a robust platform for PTM research and a scalable framework for assessing the functional consequences of coding variants through the lens of PTMs. DeepMVP is a deep learning framework for predicting PTM sites and variant-induced alterations across six modification types, including phosphorylation, acetylation, methylation, sumoylation, ubiquitination and N-glycosylation.
About the journal:
Nature Methods is a monthly journal that focuses on publishing innovative methods and substantial enhancements to fundamental life sciences research techniques. Geared towards a diverse, interdisciplinary readership of researchers in academia and industry engaged in laboratory work, the journal offers new tools for research and emphasizes the immediate practical significance of the featured work. It publishes primary research papers and reviews recent technical and methodological advancements, with a particular interest in primary methods papers relevant to the biological and biomedical sciences. This includes methods rooted in chemistry with practical applications for studying biological problems.