Improvement in positional accuracy of neural-network predicted hydration sites of proteins by incorporating atomic details of water-protein interactions and site-searching algorithm.
Authors: Kochi Sato, Masayoshi Nakasako
Journal: Biophysics and physicobiology, 22(1), e220004 (published 2025-01-30)
DOI: https://doi.org/10.2142/biophysico.bppb-v22.0004
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11876803/pdf/
Abstract
Visualizing hydration structures over the entire protein surface is necessary to understand why the aqueous environment is essential for protein folding and function. However, this remains difficult to achieve experimentally. Recently, we developed a convolutional neural network (CNN) to predict the probability distribution of hydration water molecules over protein surfaces and in protein cavities. The deep network was optimized using solely the distribution patterns of protein atoms surrounding each hydration water molecule in high-resolution X-ray crystal structures, and it successfully provided probability distributions of hydration water molecules. Despite the effectiveness of the probability distribution, the sites predicted from its local maxima did not reproduce the positions of the hydration sites in the crystal structure models with adequate accuracy. In this work, we modified the deep network by subdividing atomic classes based on the electronic properties of the atoms composing amino acids. In addition, the exclusion volumes of each protein atom and hydration water molecule were taken into account when predicting the hydration sites from the probability distribution. This information on the chemical properties of atoms leads to an improvement in positional prediction accuracy. We selected the best CNN from 47 CNNs constructed by systematically varying the number of channels and layers of the neural networks. Here, we report the improvements in prediction accuracy achieved by the reorganized CNN, together with details of the architecture, training data, and peak search algorithm.
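The abstract does not spell out the peak search itself, so the following is only a minimal sketch of one common way to turn a gridded probability distribution into discrete sites while respecting excluded volumes. It is not the authors' algorithm. The function name `pick_hydration_sites`, all parameter names, the radii, and the probability threshold are hypothetical placeholders chosen for illustration. The sketch masks grid points that overlap protein atoms, then repeatedly accepts the highest remaining maximum and suppresses the grid points inside that water molecule's exclusion sphere.

```python
import numpy as np


def pick_hydration_sites(prob, spacing, origin, protein_xyz,
                         protein_radius=2.4, water_radius=2.4,
                         threshold=0.5):
    """Greedy peak search on a 3D probability grid with excluded volumes.

    Illustrative only: radii (in the same length unit as `spacing`) and
    `threshold` are assumed values, not taken from the paper.
    `protein_radius` may be a scalar or one value per protein atom.
    """
    prob = np.asarray(prob, dtype=float)
    origin = np.asarray(origin, dtype=float)
    protein_xyz = np.asarray(protein_xyz, dtype=float).reshape(-1, 3)
    radii = np.broadcast_to(np.asarray(protein_radius, dtype=float),
                            (len(protein_xyz),))

    if prob.size == 0:
        return []

    # Cartesian coordinates of every grid point, in the same (C) order
    # as prob.ravel(); shape (N, 3).
    idx = np.indices(prob.shape).reshape(3, -1).T
    xyz = origin + spacing * idx
    p = prob.ravel().copy()

    # Exclusion volume of the protein: no site may overlap a protein atom.
    for atom, radius in zip(protein_xyz, radii):
        p[np.linalg.norm(xyz - atom, axis=1) < radius] = 0.0

    sites = []
    while True:
        k = int(np.argmax(p))
        if p[k] <= 0.0 or p[k] < threshold:
            break
        sites.append((xyz[k].copy(), float(p[k])))
        # Exclusion volume of the accepted water: suppress its neighborhood
        # (including the accepted grid point itself).
        p[np.linalg.norm(xyz - xyz[k], axis=1) < water_radius] = 0.0
    return sites
```

Each returned entry is a pair of a Cartesian position and the probability at that grid point, ordered from the highest to the lowest accepted peak. Accepting peaks greedily is a design choice of this sketch: it guarantees that no two predicted sites lie closer than `water_radius`, at the cost of resolving positions only to the grid spacing.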