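Chemistry-informed Machine Learning Explains Calcium-binding Proteins Fuzzy Shape for Communicating Changes in the Atomic States of Calcium Ions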

Pengzhi Zhang, Jules Nde, Yossi Eliaz, Nathaniel Jennings, Piotr Cieplak, Margaret S. Cheung
Journal: arXiv - QuanBio - Quantitative Methods
Publication date: 2024-07-24
DOI: arxiv-2407.17017 (https://doi.org/arxiv-2407.17017)
Citations: 0

Abstract

Protein fuzziness is a feature for communicating changes in cell signaling instigated by binding with second messengers, such as calcium ions, which are associated with the coordination of muscle contraction, neurotransmitter release, and gene expression. When binding to the disordered parts of a protein, calcium ions must balance their charge states with the shape of calcium-binding proteins and their versatile pool of partners, depending on the circumstances they transmit. It is unclear, however, whether the limited experimental data available can be used to train models that accurately predict the charges of calcium-binding protein variants. Here, we developed a chemistry-informed machine-learning algorithm that implements a game-theoretic approach to explain the output of a machine-learning model, without requiring an excessively large database for high-performance prediction of atomic charges. We used ab initio electronic structure data representing calcium ions and the structures of the disordered segments of calcium-binding peptides with surrounding water molecules to train several explainable models. Network theory was used to extract the topological features of atomic interactions in the structurally complex data dictated by the coordination chemistry of a calcium ion, a potent indicator of its charge state in a protein. With these designs, we provide a framework for explainable machine-learning models that annotates the atomic charges of calcium ions in calcium-binding proteins with domain knowledge, in response to chemical changes in the environment, based on the limited size of scientific data in a genome space.
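The abstract names two ingredients: topological features drawn from the coordination environment of a calcium ion, and a game-theoretic explanation of a model's output. The sketch below illustrates how the two could fit together, under stated assumptions. It assumes the game-theoretic approach is Shapley-value attribution, which the abstract does not state. The two features, the 3.0 angstrom cutoff, the coordinates, and `toy_charge_model` are all hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import dist, factorial


def coordination_features(ca, ligands, cutoff=3.0):
    """Count ligand atoms within `cutoff` of the ion and average their distances.

    The cutoff (in angstroms) is an illustrative assumption.
    """
    near = [d for d in (dist(ca, p) for p in ligands) if d <= cutoff]
    count = len(near)
    return {
        "coordination_number": float(count),
        "mean_distance": sum(near) / count if count else 0.0,
    }


def toy_charge_model(x):
    # Hypothetical stand-in for a trained model; the coefficients are made up.
    return 2.0 - 0.05 * x["coordination_number"] - 0.1 * x["mean_distance"]


def shapley_values(model, x, baseline):
    """Exact Shapley values; features outside a coalition take baseline values."""
    names = list(x)
    n = len(names)

    def value(coalition):
        mixed = {k: (x[k] if k in coalition else baseline[k]) for k in names}
        return model(mixed)

    phi = {}
    for i in names:
        others = [k for k in names if k != i]
        total = 0.0
        for r in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(r) * factorial(n - r - 1) / factorial(n)
            for s in combinations(others, r):
                total += weight * (value(set(s) | {i}) - value(set(s)))
        phi[i] = total
    return phi


# Made-up geometry: three oxygen-like atoms near the ion, one beyond the cutoff.
ca_position = (0.0, 0.0, 0.0)
ligand_positions = [(2.4, 0.0, 0.0), (0.0, 2.4, 0.0), (0.0, 0.0, 2.4), (5.0, 0.0, 0.0)]
baseline = {"coordination_number": 0.0, "mean_distance": 0.0}

features = coordination_features(ca_position, ligand_positions)
attributions = shapley_values(toy_charge_model, features, baseline)
print(features)
print(attributions)
```

The attributions sum to the difference between the model's prediction for the example and its prediction for the baseline, which is the property that makes Shapley values useful for explaining a single predicted charge. Exact enumeration scales exponentially with the number of features, so it is only practical for small feature sets like this one.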