Accelerated Missense Mutation Identification in Intrinsically Disordered Proteins Using Deep Learning

IF 5.5; Zone 2, Chemistry; JCR Q1, Biochemistry & Molecular Biology
Swarnadeep Seth and Aniket Bhattacharya*
Biomacromolecules 2025, 26 (4), 2106–2115. Journal Article. Publication date: 2025-03-12. DOI: 10.1021/acs.biomac.4c01124. URL: https://pubs.acs.org/doi/10.1021/acs.biomac.4c01124

Abstract

We combine Brownian dynamics (BD) simulation results with deep learning (DL) strategies to rapidly identify large structural changes caused by missense mutations in intrinsically disordered proteins (IDPs). We used ∼6500 IDP sequences of length 20–300 from the MobiDB database to obtain gyration radii from BD simulations of a coarse-grained single-bead amino acid model (HPS2 model) used by us and others [Dignon, G. L. PLoS Comput. Biol. 2018, 14, e1005941; Tesei, G. Proc. Natl. Acad. Sci. U.S.A. 2021, 118, e2111696118; Seth, S. J. Chem. Phys. 2024, 160, 014902]; these radii form the training sets for the DL algorithm. Using the gyration radii ⟨Rg⟩ of the simulated IDPs as the training set, we develop a multilayer perceptron neural net (NN) architecture that takes the sequence and the corresponding parameters from the HPS model as input and predicts, with 97% accuracy, the gyration radii of 33 IDPs previously studied by BD simulation. We then use this NN to predict the gyration radii of every permutation of missense mutations in IDPs. Our approach successfully identifies mutation-prone regions that induce significant alterations in the radius of gyration relative to the wild-type IDP sequence. We further validate the predictions by running BD simulations on the subset of identified mutants. The neural network yields a 10⁴–10⁶-fold faster computation in the search space for potentially harmful mutations. Our findings have substantial implications for the rapid identification and understanding of diseases related to missense mutations in IDPs and for the development of potential therapeutic interventions. The method can be extended to accurate predictions of other mutation effects in disordered proteins.
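
The mutation scan described in the abstract can be illustrated with a minimal Python sketch. It enumerates every single-residue substitution of a sequence (19 alternatives per position) and ranks the mutants by the predicted change in gyration radius relative to the wild type. Everything here is an assumption made for illustration: `toy_rg` is a hypothetical stand-in with made-up coefficients, not the authors' trained multilayer perceptron or the HPS model, and the demo sequence is invented. In practice the `predict` argument would be replaced by a trained network.

```python
from typing import Callable, Iterator, List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard residues
HYDROPHOBIC = set("AVILMFWY")          # illustrative grouping (assumption)
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def single_missense_mutants(seq: str) -> Iterator[Tuple[int, str, str, str]]:
    """Yield (index, wild-type residue, new residue, mutant sequence)
    for every single-residue substitution of seq."""
    for i, wt in enumerate(seq):
        for aa in AMINO_ACIDS:
            if aa != wt:
                yield i, wt, aa, seq[:i] + aa + seq[i + 1:]


def toy_rg(seq: str) -> float:
    """Hypothetical placeholder predictor, NOT the trained NN.
    Uses a power law R_g ~ b * N**nu whose exponent is lowered by
    hydrophobic content and raised by net charge; all coefficients
    are arbitrary and chosen only so the example runs."""
    n = len(seq)
    hyd = sum(c in HYDROPHOBIC for c in seq) / n
    ncpr = abs(sum(CHARGE.get(c, 0) for c in seq)) / n
    nu = 0.55 - 0.15 * hyd + 0.15 * ncpr
    return 0.25 * n ** nu


def rank_mutations(
    seq: str, predict: Callable[[str], float], top: int = 5
) -> List[Tuple[str, float]]:
    """Return the `top` mutations with the largest |delta R_g|,
    labelled in the usual form, e.g. 'K2A' (1-based position)."""
    rg_wt = predict(seq)
    scored = [
        (f"{wt}{i + 1}{aa}", predict(mut) - rg_wt)
        for i, wt, aa, mut in single_missense_mutants(seq)
    ]
    scored.sort(key=lambda item: abs(item[1]), reverse=True)
    return scored[:top]


demo_seq = "MKDEEGSAALKRPQEE"  # invented sequence, for illustration only
for label, delta in rank_mutations(demo_seq, toy_rg):
    print(f"{label}: delta R_g = {delta:+.4f} (arbitrary units)")
```

A sequence of length N has 19N single substitutions, so a full scan at N = 300 needs 5700 predictor calls. Replacing each BD run with one network evaluation is what makes a search of this size practical.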


Source journal: Biomacromolecules (Chemistry, Polymer Science)
CiteScore: 10.60
Self-citation rate: 4.80%
Articles published: 417
Review time: 1.6 months
Journal description: Biomacromolecules is a leading forum for the dissemination of cutting-edge research at the interface of polymer science and biology. Submissions to Biomacromolecules should contain strong elements of innovation in terms of macromolecular design, synthesis and characterization, or in the application of polymer materials to biology and medicine. Topics covered by Biomacromolecules include, but are not exclusively limited to: sustainable polymers, polymers based on natural and renewable resources, degradable polymers, polymer conjugates, polymeric drugs, polymers in biocatalysis, biomacromolecular assembly, biomimetic polymers, polymer-biomineral hybrids, biomimetic-polymer processing, polymer recycling, bioactive polymer surfaces, original polymer design for biomedical applications such as immunotherapy, drug delivery, gene delivery, antimicrobial applications, diagnostic imaging and biosensing, polymers in tissue engineering and regenerative medicine, polymeric scaffolds and hydrogels for cell culture and delivery.