Pengzhi Zhang, Jules Nde, Yossi Eliaz, Nathaniel Jennings, Piotr Cieplak, Margaret S. Cheung
arXiv - QuanBio - Quantitative Methods, 2024-07-24. https://doi.org/arxiv-2407.17017
Chemistry-informed Machine Learning Explains Calcium-binding Proteins Fuzzy Shape for Communicating Changes in the Atomic States of Calcium Ions
Protein fuzziness is a feature for communicating changes in cell signaling
instigated by binding with second messengers, such as calcium ions, which are
associated with the coordination of muscle contraction, neurotransmitter
release, and gene expression. Binding with the disordered parts of a protein,
calcium ions must balance their charge states with the shape of calcium-binding
proteins and their versatile pool of partners depending on the circumstances
they transmit. However, it is unclear whether the limited experimental data
available can be used to train models that accurately predict the charges of
calcium-binding protein variants. Here, we developed a chemistry-informed
machine-learning algorithm that implements a game-theoretic approach to explain
the output of a machine-learning model without the prerequisite of an
excessively large database for high-performance prediction of atomic charges.
We used ab initio electronic structure data representing calcium ions and
the structures of the disordered segments of calcium-binding peptides with
surrounding water molecules to train several explainable models. Network theory
was used to extract the topological features of atomic interactions in the
structurally complex data dictated by the coordination chemistry of a calcium
ion, a potent indicator of its charge state in a protein. With our designs, we
provided a framework of explainable machine-learning models to annotate atomic
charges of calcium ions in calcium-binding proteins with domain knowledge, in
response to chemical changes in the environment, based on the limited size of
scientific data in a genome space.
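
The two ideas named in the abstract, a topological feature taken from a contact network and a game-theoretic explanation of a model's output, can be illustrated with a minimal sketch. This is not the authors' implementation: the 3.0 angstrom cutoff, the toy geometry, the feature names, and the function `toy_charge_model` with its coefficients are all hypothetical and chosen only for illustration. The explanation step computes exact Shapley values by enumerating feature coalitions, which is feasible only for a handful of features.

```python
from itertools import combinations
from math import dist, factorial


def coordination_number(ca_xyz, oxygen_xyz, cutoff=3.0):
    """Degree of the Ca node in a distance-cutoff contact graph.

    The cutoff (in angstroms) is an assumed value, not one from the paper.
    """
    return sum(1 for o in oxygen_xyz if dist(ca_xyz, o) <= cutoff)


def toy_charge_model(features):
    """Hypothetical stand-in for a trained charge predictor (made-up coefficients)."""
    cn, mean_ca_o_distance, n_water = features
    return (2.0 - 0.05 * cn + 0.10 * (mean_ca_o_distance - 2.4)
            - 0.01 * n_water * cn)  # interaction term between two features


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x relative to a baseline sample.

    Features outside a coalition are set to their baseline value.
    """
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for coalition in combinations(others, size):
                without_i = [x[j] if j in coalition else baseline[j]
                             for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


# Toy geometry: one Ca ion with six oxygens at 2.4 angstroms and one farther away.
ca = (0.0, 0.0, 0.0)
oxygens = [(2.4, 0, 0), (-2.4, 0, 0), (0, 2.4, 0), (0, -2.4, 0),
           (0, 0, 2.4), (0, 0, -2.4), (4.0, 0, 0)]
cn = coordination_number(ca, oxygens)

x = [cn, 2.4, 2]          # sample: [coordination number, mean Ca-O distance, waters]
baseline = [7, 2.5, 1]    # hypothetical reference environment
phi = shapley_values(toy_charge_model, x, baseline)

for name, value in zip(["coordination_number", "mean_ca_o_distance", "n_water"], phi):
    print(f"{name}: {value:+.4f}")
```

The attributions sum to the difference between the model output at the sample and at the baseline, which is the property that makes Shapley values a convenient way to apportion a predicted charge among chemically meaningful features. For realistic feature counts, exact enumeration becomes too expensive and an approximate method would be needed.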