Title: Explore Conformational Space of Proteins with Supervised Auto-Encoder
Authors: Chen Guanglin, Zhang Zhiyong
Journal: Chinese Physics
Published: 2023-01-01
DOI: 10.7498/aps.72.20231060 (https://doi.org/10.7498/aps.72.20231060)
Abstract
Protein function is related to its structure and dynamics. Molecular dynamics (MD) simulation is an important tool for studying protein dynamics by exploring conformational space; however, conformational sampling is nontrivial, because under-sampling risks missing key details. In recent years, deep learning methods such as auto-encoders have been coupled with MD to explore the conformational space of proteins. After training on MD trajectories, an auto-encoder can quickly generate new conformations from random numbers supplied in a low-dimensional space. However, some issues remain, such as the demands on the quality of the training set, the limited explorable region, and the undefined sampling direction. In this work, we build a supervised auto-encoder in which reaction coordinates guide conformational exploration along chosen directions. We also attempt to expand the explorable region by training on data generated by the model itself. Two multi-domain proteins, bacteriophage T4 lysozyme and adenylate kinase, are used to illustrate the method. Even when the training set consists only of under-sampled simulation trajectories, the supervised auto-encoder can still explore along the given reaction coordinates. The explored conformational space covers all the experimental structures of the two proteins and extends to regions far from the training sets. Verification by molecular dynamics and secondary structure calculations shows that most of the explored conformations are plausible. The supervised auto-encoder provides a way to efficiently expand the conformational space of a protein with limited computational resources, although suitable reaction coordinates are required. By integrating appropriate reaction coordinates or experimental data, the supervised auto-encoder may serve as an efficient tool for exploring the conformational space of proteins.
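
The abstract does not specify the network architecture or the loss function, so the following is only a minimal sketch of the general idea of a supervised auto-encoder, not the authors' implementation. It assumes a linear encoder and decoder, a single latent variable tied to a single reaction coordinate, and a hypothetical weight `lam` that balances the reconstruction term against the supervision term. The "conformations" are toy 3-D points, not protein structures, and all class and function names are illustrative.

```python
def make_toy_data(n=20):
    """Toy 'conformations': 3-D points on a line, labelled by a
    reaction coordinate r in [0, 1] (an assumption for illustration)."""
    data = []
    for k in range(n):
        r = k / (n - 1)
        data.append(([r, 2.0 * r + 1.0, 0.5 - r], r))
    return data


class SupervisedAutoEncoder:
    """Linear auto-encoder with one latent variable z.
    Loss = reconstruction error + lam * (z - reaction coordinate)^2."""

    def __init__(self, dim, lam=1.0):
        self.w_e = [0.1] * dim   # encoder weights
        self.b_e = 0.0           # encoder bias
        self.w_d = [0.1] * dim   # decoder weights
        self.b_d = [0.0] * dim   # decoder biases
        self.lam = lam           # hypothetical supervision weight

    def encode(self, x):
        return sum(w * xi for w, xi in zip(self.w_e, x)) + self.b_e

    def decode(self, z):
        return [w * z + b for w, b in zip(self.w_d, self.b_d)]

    def loss(self, data):
        total = 0.0
        for x, r in data:
            z = self.encode(x)
            x_hat = self.decode(z)
            total += sum((a - b) ** 2 for a, b in zip(x_hat, x))
            total += self.lam * (z - r) ** 2
        return total / len(data)

    def train_step(self, data, lr):
        """One full-batch gradient-descent step with hand-written gradients."""
        dim, n = len(self.w_e), len(data)
        g_we, g_wd, g_bd = [0.0] * dim, [0.0] * dim, [0.0] * dim
        g_be = 0.0
        for x, r in data:
            z = self.encode(x)
            x_hat = self.decode(z)
            # Gradient of the reconstruction term w.r.t. each output
            err = [2.0 * (a - b) / n for a, b in zip(x_hat, x)]
            # Gradient w.r.t. z: reconstruction path plus supervision term
            dz = sum(e * w for e, w in zip(err, self.w_d))
            dz += 2.0 * self.lam * (z - r) / n
            for j in range(dim):
                g_wd[j] += err[j] * z
                g_bd[j] += err[j]
                g_we[j] += dz * x[j]
            g_be += dz
        for j in range(dim):
            self.w_e[j] -= lr * g_we[j]
            self.w_d[j] -= lr * g_wd[j]
            self.b_d[j] -= lr * g_bd[j]
        self.b_e -= lr * g_be


data = make_toy_data()
model = SupervisedAutoEncoder(dim=3)
initial_loss = model.loss(data)
for _ in range(4000):
    model.train_step(data, lr=0.005)
final_loss = model.loss(data)

# Decode a latent value outside the training range (r in [0, 1]) to
# step along the supervised coordinate beyond the sampled region.
extrapolated = model.decode(1.5)
print(initial_loss, final_loss, extrapolated)
```

Because the supervision term pulls the latent variable toward the reaction coordinate, choosing a latent value amounts to choosing a position along that coordinate, which is what gives the exploration a defined direction in this sketch. Whether a decoded point is physically plausible is a separate question; the abstract reports checking this with molecular dynamics and secondary structure calculations.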