Swati Adhikari (The University of Burdwan), Dhananjay Bhattacharyya (Saha Institute of Nuclear Physics; University of Calcutta), Parthajit Roy (The University of Burdwan)
{"title":"Counterions in RNA structure: Structural bioinformatics analysis to identify the role of Mg2+ ions in base pair formation","authors":"Swati Adhikari (The University of Burdwan), Dhananjay Bhattacharyya (Saha Institute of Nuclear Physics; University of Calcutta), Parthajit Roy (The University of Burdwan)","doi":"arxiv-2408.05355","DOIUrl":"https://doi.org/arxiv-2408.05355","url":null,"abstract":"The contribution of metal ions to the structure and function of nucleic acids is undeniable. Among the available metal ions, such as Na+, K+, Ca2+, and Mg2+, the Mg2+ ion plays a very significant and well-studied role in the stability of RNA structures. However, it is not possible to grasp the entire functionality of the Mg2+ ion in the structure of RNA. To better understand Mg-RNA complexes, in the present study we investigated 1541 non-redundant crystal structures of RNA and generated reports of various statistics for these complexes by computing base pair and Mg2+ binding statistics. We also report whether the presence of Mg2+ ions can alter the stability of base pairs, by computing and comparing base pair stabilities.
We noted that Mg2+ ions do not affect the canonical base pair G:C W:WC, while the majority of the non-canonical G:G W:HC base pairs, which are also important in DNA telomere structures, have a magnesium ion bound to the O6 or N7 atom of one of the guanines.","PeriodicalId":501022,"journal":{"name":"arXiv - QuanBio - Biomolecules","volume":"6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142216210","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Synthetic High-resolution Cryo-EM Density Maps with Generative Adversarial Networks","authors":"Chenwei Zhang, Anne Condon, Khanh Dao Duc","doi":"arxiv-2407.17674","DOIUrl":"https://doi.org/arxiv-2407.17674","url":null,"abstract":"Generating synthetic cryogenic electron microscopy (cryo-EM) 3D density maps from molecular structures has potentially important applications in structural biology. Yet existing simulation-based methods cannot mimic all the complex features present in experimental maps, such as secondary structure elements. As an alternative, we propose struc2mapGAN, a novel data-driven method that employs a generative adversarial network (GAN) to produce high-resolution experimental-like density maps from molecular structures. More specifically, struc2mapGAN uses a U-Net++ architecture as the generator, with an additional L1 loss term and further processing of raw experimental maps to enhance learning efficiency. struc2mapGAN can promptly generate maps after training, and we demonstrate that it outperforms existing simulation-based methods for a wide array of tested maps and across various evaluation metrics. Our code is available at https://github.com/chenwei-zhang/struc2mapGAN.","PeriodicalId":501022,"journal":{"name":"arXiv - QuanBio - Biomolecules","volume":"42 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141781682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Ranking protein-protein models with large language models and graph neural networks","authors":"Xiaotong Xu, Alexandre M. J. J. Bonvin","doi":"arxiv-2407.16375","DOIUrl":"https://doi.org/arxiv-2407.16375","url":null,"abstract":"Protein-protein interactions (PPIs) are associated with various diseases, including cancer, infections, and neurodegenerative disorders. Obtaining three-dimensional structural information on these PPIs serves as a foundation to interfere with those or to guide drug design. Various strategies can be followed to model those complexes, all typically resulting in a large number of models. A challenging step in this process is the identification of good models (near-native PPI conformations) from the large pool of generated models. To address this challenge, we previously developed DeepRank-GNN-esm, a graph-based deep learning algorithm for ranking modelled PPI structures harnessing the power of protein language models. Here, we detail the use of our software with examples. DeepRank-GNN-esm is freely available at https://github.com/haddocking/DeepRank-GNN-esm","PeriodicalId":501022,"journal":{"name":"arXiv - QuanBio - Biomolecules","volume":"18 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141781685","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jacob Sumner, Grace Meng, Naomi Brandt, Alex T. Grigas, Andrés Córdoba, Mark D. Shattuck, Corey S. O'Hern
{"title":"Assessment of scoring functions for computational models of protein-protein interfaces","authors":"Jacob Sumner, Grace Meng, Naomi Brandt, Alex T. Grigas, Andrés Córdoba, Mark D. Shattuck, Corey S. O'Hern","doi":"arxiv-2407.16580","DOIUrl":"https://doi.org/arxiv-2407.16580","url":null,"abstract":"A goal of computational studies of protein-protein interfaces (PPIs) is to predict the binding site between two monomers that form a heterodimer. The simplest version of this problem is to rigidly re-dock the bound forms of the monomers, which involves generating computational models of the heterodimer and then scoring them to determine the most native-like models. Scoring functions have been assessed previously using rank- and classification-based metrics; however, these methods are sensitive to the number and quality of models in the scoring function training set. We assess the accuracy of seven PPI scoring functions by comparing their scores to a measure of structural similarity to the x-ray crystal structure (i.e., the DockQ score) for a non-redundant set of heterodimers from the Protein Data Bank. For each heterodimer, we generate re-docked models uniformly sampled over DockQ and calculate the Spearman correlation between the PPI scores and DockQ. For some targets, the scores and DockQ are highly correlated; however, for many targets, there are weak correlations. Several physical features can explain the difference between difficult- and easy-to-score targets. For example, strong correlations exist between the score and DockQ for targets with highly intertwined monomers and many interface contacts. We also develop a new score based on only three physical features that matches or exceeds the performance of current PPI scoring functions.
These results emphasize that PPI prediction can be improved by focusing on correlations between the PPI score and DockQ and incorporating more discriminating physical features into PPI scoring functions.","PeriodicalId":501022,"journal":{"name":"arXiv - QuanBio - Biomolecules","volume":"45 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141781686","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Rommie Amaro, Johan Åqvist, Ivet Bahar, Federica Battistini, Adam Bellaiche, Daniel Beltran, Philip C. Biggin, Massimiliano Bonomi, Gregory R. Bowman, Richard Bryce, Giovanni Bussi, Paolo Carloni, David Case, Andrea Cavalli, Chie-En A. Chang, Thomas E. Cheatham III, Margaret S. Cheung, Cris Chipot, Lillian T. Chong, Preeti Choudhary, Cecilia Clementi, Rosana Collepardo-Guevara, Peter Coveney, T. Daniel Crawford, Matteo Dal Peraro, Bert de Groot, Lucie Delemotte, Marco De Vivo, Jonathan Essex, Franca Fraternali, Jiali Gao, Josep Lluís Gelpí, Francesco Luigi Gervasio, Fernando Danilo Gonzalez-Nilo, Helmut Grubmüller, Marina Guenza, Horacio V. Guzman, Sarah Harris, Teresa Head-Gordon, Rigoberto Hernandez, Adam Hospital, Niu Huang, Xuhui Huang, Gerhard Hummer, Javier Iglesias-Fernández, Jan H. Jensen, Shantenu Jha, Wanting Jiao, Shina Caroline Lynn Kamerlin, Syma Khalid, Charles Laughton, Michael Levitt, Vittorio Limongelli, Erik Lindahl, Kersten Lindorff-Larsen, Sharon Loverde, Magnus Lundborg, Yun Lina Luo, Francisco Javier Luque, Charlotte I. Lynch, Alexander MacKerell, Alessandra Magistrato, Siewert J. Marrink, Hugh Martin, J. Andrew McCammon, Kenneth Merz, Vicent Moliner, Adrian Mulholland, Sohail Murad, Athi N. Naganathan, Shikha Nangia, Frank Noe, Agnes Noy, Julianna Oláh, Megan O'Mara, Mary Jo Ondrechen, José N. Onuchic, Alexey Onufriev, Silvia Osuna, Anna R. Panchenko, Sergio Pantano, Michele Parrinello, Alberto Perez, Tomas Perez-Acle, Juan R. Perilla, B. Montgomery Pettitt, Adriana Pietropalo, Jean-Philip Piquemal, Adolfo Poma, Matej Praprotnik, Maria J. Ramos, Pengyu Ren, Nathalie Reuter, Adrian Roitberg, Edina Rosta, Carme Rovira, Benoit Roux, Ursula Röthlisberger, Karissa Y. Sanbonmatsu, Tamar Schlick, Alexey K. Shaytan, Carlos Simmerling, Jeremy C. Smith, Yuji Sugita, Katarzyna Świderek, Makoto Taiji, Peng Tao, Julian Tirado-Rives, Inaki Tunón, Marc W. Van Der Kamp, David Van der Spoel, Sameer Velankar, Gregory A. 
Voth, Rebecca Wade, Ariel Warshel, Valerie Vaissier Welborn, Stacey Wetmore, Chung F. Wong, Lee-Wei Yang, Martin Zacharias, Modesto Orozco
{"title":"The need to implement FAIR principles in biomolecular simulations","authors":"Rommie Amaro, Johan Åqvist, Ivet Bahar, Federica Battistini, Adam Bellaiche, Daniel Beltran, Philip C. Biggin, Massimiliano Bonomi, Gregory R. Bowman, Richard Bryce, Giovanni Bussi, Paolo Carloni, David Case, Andrea Cavalli, Chie-En A. Chang, Thomas E. Cheatham III, Margaret S. Cheung, Cris Chipot, Lillian T. Chong, Preeti Choudhary, Cecilia Clementi, Rosana Collepardo-Guevara, Peter Coveney, T. Daniel Crawford, Matteo Dal Peraro, Bert de Groot, Lucie Delemotte, Marco De Vivo, Jonathan Essex, Franca Fraternali, Jiali Gao, Josep Lluís Gelpí, Francesco Luigi Gervasio, Fernando Danilo Gonzalez-Nilo, Helmut Grubmüller, Marina Guenza, Horacio V. Guzman, Sarah Harris, Teresa Head-Gordon, Rigoberto Hernandez, Adam Hospital, Niu Huang, Xuhui Huang, Gerhard Hummer, Javier Iglesias-Fernández, Jan H. Jensen, Shantenu Jha, Wanting Jiao, Shina Caroline Lynn Kamerlin, Syma Khalid, Charles Laughton, Michael Levitt, Vittorio Limongelli, Erik Lindahl, Kersten Lindorff-Larsen, Sharon Loverde, Magnus Lundborg, Yun Lina Luo, Francisco Javier Luque, Charlotte I. Lynch, Alexander MacKerell, Alessandra Magistrato, Siewert J. Marrink, Hugh Martin, J. Andrew McCammon, Kenneth Merz, Vicent Moliner, Adrian Mulholland, Sohail Murad, Athi N. Naganathan, Shikha Nangia, Frank Noe, Agnes Noy, Julianna Oláh, Megan O'Mara, Mary Jo Ondrechen, José N. Onuchic, Alexey Onufriev, Silvia Osuna, Anna R. Panchenko, Sergio Pantano, Michele Parrinello, Alberto Perez, Tomas Perez-Acle, Juan R. Perilla, B. Montgomery Pettitt, Adriana Pietropalo, Jean-Philip Piquemal, Adolfo Poma, Matej Praprotnik, Maria J. Ramos, Pengyu Ren, Nathalie Reuter, Adrian Roitberg, Edina Rosta, Carme Rovira, Benoit Roux, Ursula Röthlisberger, Karissa Y. Sanbonmatsu, Tamar Schlick, Alexey K. Shaytan, Carlos Simmerling, Jeremy C. Smith, Yuji Sugita, Katarzyna Świderek, Makoto Taiji, Peng Tao, Julian Tirado-Rives, Inaki Tunón, Marc W. 
Van Der Kamp, David Van der Spoel, Sameer Velankar, Gregory A. Voth, Rebecca Wade, Ariel Warshel, Valerie Vaissier Welborn, Stacey Wetmore, Chung F. Wong, Lee-Wei Yang, Martin Zacharias, Modesto Orozco","doi":"arxiv-2407.16584","DOIUrl":"https://doi.org/arxiv-2407.16584","url":null,"abstract":"This letter illustrates the opinion of the molecular dynamics (MD) community on the need to adopt a new FAIR paradigm for the use of molecular simulations. It highlights the necessity of a collaborative effort to create, establish, and sustain a database that allows findability, accessibility, interoperability, and reusability of molecular dynamics simulation data. Such a development would democratize the field and significantly improve the impact of MD simulations on life science research. This will transform our working paradigm, pushing the field to a new frontier. We invite you to support our initiative at the MDDB community (https://mddbr.eu/community/)","PeriodicalId":501022,"journal":{"name":"arXiv - QuanBio - Biomolecules","volume":"48 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141781683","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Molecular design for cardiac cell differentiation using a small dataset and decorated shape features","authors":"Fatemeh Etezadi, Shunichi Ito, Kosuke Yasui, Rodi Kado Abdalkader, Itsunari Minami, Motonari Uesugi, Ganesh Pandian Namasivayam, Haruko Nakano, Atsushi Nakano, Daniel M. Packwood","doi":"arxiv-2407.15322","DOIUrl":"https://doi.org/arxiv-2407.15322","url":null,"abstract":"The discovery of small organic compounds for inducing stem cell differentiation is a time- and resource-intensive process. While data science could, in principle, facilitate the discovery of these compounds, novel approaches are required due to the difficulty of acquiring training data from large numbers of example compounds. In this paper, we demonstrate the design of a new compound for inducing cardiomyocyte differentiation using simple regression models trained with a data set containing only 80 examples. We introduce decorated shape descriptors, an information-rich molecular feature representation that integrates both molecular shape and hydrophilicity information. These models demonstrate improved performance compared to ones using standard molecular descriptors based on shape alone. Model overtraining is diagnosed using a new type of sensitivity analysis.
Our new compound is designed using a conservative molecular design strategy, and its effectiveness is confirmed through expression profiles of cardiomyocyte-related marker genes using real-time polymerase chain reaction experiments on human iPS cell lines. This work demonstrates a viable data-driven strategy for designing new compounds for stem cell differentiation protocols and will be useful in situations where training data is limited.","PeriodicalId":501022,"journal":{"name":"arXiv - QuanBio - Biomolecules","volume":"70 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141781465","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Holographic nature of critical quantum states of proteins","authors":"Eszter Papp, Gabor Vattay","doi":"arxiv-2407.15101","DOIUrl":"https://doi.org/arxiv-2407.15101","url":null,"abstract":"The Anderson metal-insulator transition is a fundamental phenomenon in condensed matter physics, describing the transition from a conducting (metallic) to a non-conducting (insulating) state driven by disorder in a material. At the critical point of the Anderson transition, wave functions exhibit multifractal behavior, and energy levels display a universal distribution, indicating non-trivial correlations in the eigenstates. Recent studies have shown that proteins, traditionally considered as insulators, exhibit much higher conductivity than previously assumed. In this paper, we investigate several proteins known for their efficient electron transport properties. We compare their energy level statistics, eigenfunction correlation, and electron return probability to those expected in metallic, insulating, or critical states. Remarkably, these proteins exhibit properties of critically disordered metals in their natural state without any parameter adjustment. Their composition and geometry are self-organized into the critical state of the Anderson transition, and their fractal properties are universal and unique among critical systems.
Our findings suggest that proteins' wave functions fulfill \"holographic\" area laws, and the correlation fractal dimension is precisely $d_2=2$.","PeriodicalId":501022,"journal":{"name":"arXiv - QuanBio - Biomolecules","volume":"45 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141781697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploiting Pre-trained Models for Drug Target Affinity Prediction with Nearest Neighbors","authors":"Qizhi Pei, Lijun Wu, Zhenyu He, Jinhua Zhu, Yingce Xia, Shufang Xie, Rui Yan","doi":"arxiv-2407.15202","DOIUrl":"https://doi.org/arxiv-2407.15202","url":null,"abstract":"Drug-Target binding Affinity (DTA) prediction is essential for drug discovery. Despite the application of deep learning methods to DTA prediction, the achieved accuracy remains suboptimal. In this work, inspired by the recent success of retrieval methods, we propose $k$NN-DTA, a non-parametric embedding-based retrieval method adopted on a pre-trained DTA prediction model, which can extend the power of the DTA model with no or negligible cost. Different from existing methods, we introduce two neighbor aggregation ways from both embedding space and label space that are integrated into a unified framework. Specifically, we propose a label aggregation with pair-wise retrieval and a representation aggregation with point-wise retrieval of the nearest neighbors. This method executes in the inference phase and can efficiently boost the DTA prediction performance with no training cost. In addition, we propose an extension, Ada-$k$NN-DTA, an instance-wise and adaptive aggregation with lightweight learning. Results on four benchmark datasets show that $k$NN-DTA brings significant improvements, outperforming previous state-of-the-art (SOTA) results, e.g., on BindingDB IC$_{50}$ and $K_i$ testbeds, $k$NN-DTA obtains new records of RMSE 0.684 and 0.750. The extended Ada-$k$NN-DTA further improves the performance to 0.675 and 0.735 RMSE. These results strongly prove the effectiveness of our method.
Results in other settings and comprehensive studies/analyses also show the great potential of our $k$NN-DTA approach.","PeriodicalId":501022,"journal":{"name":"arXiv - QuanBio - Biomolecules","volume":"8 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141781687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Vineet Thumuluri, Peter Eckmann, Michael K. Gilson, Rose Yu
{"title":"Technical report: Improving the properties of molecules generated by LIMO","authors":"Vineet Thumuluri, Peter Eckmann, Michael K. Gilson, Rose Yu","doi":"arxiv-2407.14968","DOIUrl":"https://doi.org/arxiv-2407.14968","url":null,"abstract":"This technical report investigates variants of the Latent Inceptionism on Molecules (LIMO) framework to improve the properties of generated molecules. We conduct ablation studies of molecular representation, decoder model, and surrogate model training scheme. The experiments suggest that an autoregressive Transformer decoder with GroupSELFIES achieves the best average properties for the random generation task.","PeriodicalId":501022,"journal":{"name":"arXiv - QuanBio - Biomolecules","volume":"26 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141781688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Antonio Mirarchi, Toni Giorgino, Gianni De Fabritiis
{"title":"mdCATH: A Large-Scale MD Dataset for Data-Driven Computational Biophysics","authors":"Antonio Mirarchi, Toni Giorgino, Gianni De Fabritiis","doi":"arxiv-2407.14794","DOIUrl":"https://doi.org/arxiv-2407.14794","url":null,"abstract":"Recent advancements in protein structure determination are revolutionizing our understanding of proteins. Still, a significant gap remains in the availability of comprehensive datasets that focus on the dynamics of proteins, which are crucial for understanding protein function, folding, and interactions. To address this critical gap, we introduce mdCATH, a dataset generated through an extensive set of all-atom molecular dynamics simulations of a diverse and representative collection of protein domains. This dataset comprises all-atom systems for 5,398 domains, modeled with a state-of-the-art classical force field, and simulated in five replicates each at five temperatures from 320 K to 413 K. The mdCATH dataset records coordinates and forces every 1 ns, for over 62 ms of accumulated simulation time, effectively capturing the dynamics of the various classes of domains and providing a unique resource for proteome-wide statistical analyses of protein unfolding thermodynamics and kinetics.
We outline the dataset structure and showcase its potential through four easily reproducible case studies, highlighting its capabilities in advancing protein science.","PeriodicalId":501022,"journal":{"name":"arXiv - QuanBio - Biomolecules","volume":"28 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141781690","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}