arXiv - QuanBio - Biomolecules: Latest Literature

Synthetic High-resolution Cryo-EM Density Maps with Generative Adversarial Networks
arXiv - QuanBio - Biomolecules Pub Date : 2024-07-24 DOI: arxiv-2407.17674
Chenwei Zhang, Anne Condon, Khanh Dao Duc
Abstract: Generating synthetic cryogenic electron microscopy (cryo-EM) 3D density maps from molecular structures has potentially important applications in structural biology. Yet existing simulation-based methods cannot mimic all the complex features present in experimental maps, such as secondary structure elements. As an alternative, we propose struc2mapGAN, a novel data-driven method that employs a generative adversarial network (GAN) to produce high-resolution experimental-like density maps from molecular structures. More specifically, struc2mapGAN uses a U-Net++ architecture as the generator, with an additional L1 loss term and further processing of raw experimental maps to enhance learning efficiency. struc2mapGAN generates maps promptly once trained, and we demonstrate that it outperforms existing simulation-based methods for a wide array of tested maps and across various evaluation metrics. Our code is available at https://github.com/chenwei-zhang/struc2mapGAN.
Citations: 0
Ranking protein-protein models with large language models and graph neural networks
arXiv - QuanBio - Biomolecules Pub Date : 2024-07-23 DOI: arxiv-2407.16375
Xiaotong Xu, Alexandre M. J. J. Bonvin
Abstract: Protein-protein interactions (PPIs) are associated with various diseases, including cancer, infections, and neurodegenerative disorders. Obtaining three-dimensional structural information on these PPIs provides a foundation for interfering with them or for guiding drug design. Various strategies can be followed to model these complexes, all typically resulting in a large number of models. A challenging step in this process is the identification of good models (near-native PPI conformations) from the large pool of generated models. To address this challenge, we previously developed DeepRank-GNN-esm, a graph-based deep learning algorithm for ranking modelled PPI structures that harnesses the power of protein language models. Here, we detail the use of our software with examples. DeepRank-GNN-esm is freely available at https://github.com/haddocking/DeepRank-GNN-esm
Citations: 0
Assessment of scoring functions for computational models of protein-protein interfaces
arXiv - QuanBio - Biomolecules Pub Date : 2024-07-23 DOI: arxiv-2407.16580
Jacob Sumner, Grace Meng, Naomi Brandt, Alex T. Grigas, Andrés Córdoba, Mark D. Shattuck, Corey S. O'Hern
Abstract: A goal of computational studies of protein-protein interfaces (PPIs) is to predict the binding site between two monomers that form a heterodimer. The simplest version of this problem is to rigidly re-dock the bound forms of the monomers, which involves generating computational models of the heterodimer and then scoring them to determine the most native-like models. Scoring functions have been assessed previously using rank- and classification-based metrics; however, these methods are sensitive to the number and quality of models in the scoring function training set. We assess the accuracy of seven PPI scoring functions by comparing their scores to a measure of structural similarity to the x-ray crystal structure (i.e., the DockQ score) for a non-redundant set of heterodimers from the Protein Data Bank. For each heterodimer, we generate re-docked models uniformly sampled over DockQ and calculate the Spearman correlation between the PPI scores and DockQ. For some targets, the scores and DockQ are highly correlated; however, for many targets, there are weak correlations. Several physical features can explain the difference between difficult- and easy-to-score targets. For example, strong correlations exist between the score and DockQ for targets with highly intertwined monomers and many interface contacts. We also develop a new score based on only three physical features that matches or exceeds the performance of current PPI scoring functions. These results emphasize that PPI prediction can be improved by focusing on correlations between the PPI score and DockQ and incorporating more discriminating physical features into PPI scoring functions.
Citations: 0
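The per-target evaluation described in the abstract above (a Spearman correlation between scoring-function values and DockQ) can be illustrated with a minimal sketch. The function names `average_ranks` and `spearman` and all numeric values below are hypothetical and are not taken from the paper; the sketch only shows how a rank correlation is computed for one target.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical re-docked models for a single target.
dockq = [0.05, 0.20, 0.35, 0.50, 0.65, 0.80, 0.95]
scores = [1.2, 1.9, 1.7, 2.8, 3.1, 3.0, 4.2]  # assumed: higher means better
rho = spearman(scores, dockq)
print(f"Spearman correlation for this target: {rho:.3f}")
```

Because only ranks enter the calculation, the result does not depend on the scale or units of the scoring function, which is what allows different scoring functions to be compared on the same target.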
The need to implement FAIR principles in biomolecular simulations
arXiv - QuanBio - Biomolecules Pub Date : 2024-07-23 DOI: arxiv-2407.16584
Rommie Amaro, Johan Åqvist, Ivet Bahar, Federica Battistini, Adam Bellaiche, Daniel Beltran, Philip C. Biggin, Massimiliano Bonomi, Gregory R. Bowman, Richard Bryce, Giovanni Bussi, Paolo Carloni, David Case, Andrea Cavalli, Chie-En A. Chang, Thomas E. Cheatham III, Margaret S. Cheung, Cris Chipot, Lillian T. Chong, Preeti Choudhary, Cecilia Clementi, Rosana Collepardo-Guevara, Peter Coveney, T. Daniel Crawford, Matteo Dal Peraro, Bert de Groot, Lucie Delemotte, Marco De Vivo, Jonathan Essex, Franca Fraternali, Jiali Gao, Josep Lluís Gelpí, Francesco Luigi Gervasio, Fernando Danilo Gonzalez-Nilo, Helmut Grubmüller, Marina Guenza, Horacio V. Guzman, Sarah Harris, Teresa Head-Gordon, Rigoberto Hernandez, Adam Hospital, Niu Huang, Xuhui Huang, Gerhard Hummer, Javier Iglesias-Fernández, Jan H. Jensen, Shantenu Jha, Wanting Jiao, Shina Caroline Lynn Kamerlin, Syma Khalid, Charles Laughton, Michael Levitt, Vittorio Limongelli, Erik Lindahl, Kersten Lindorff-Larsen, Sharon Loverde, Magnus Lundborg, Yun Lina Luo, Francisco Javier Luque, Charlotte I. Lynch, Alexander MacKerell, Alessandra Magistrato, Siewert J. Marrink, Hugh Martin, J. Andrew McCammon, Kenneth Merz, Vicent Moliner, Adrian Mulholland, Sohail Murad, Athi N. Naganathan, Shikha Nangia, Frank Noe, Agnes Noy, Julianna Oláh, Megan O'Mara, Mary Jo Ondrechen, José N. Onuchic, Alexey Onufriev, Silvia Osuna, Anna R. Panchenko, Sergio Pantano, Michele Parrinello, Alberto Perez, Tomas Perez-Acle, Juan R. Perilla, B. Montgomery Pettitt, Adriana Pietropalo, Jean-Philip Piquemal, Adolfo Poma, Matej Praprotnik, Maria J. Ramos, Pengyu Ren, Nathalie Reuter, Adrian Roitberg, Edina Rosta, Carme Rovira, Benoit Roux, Ursula Röthlisberger, Karissa Y. Sanbonmatsu, Tamar Schlick, Alexey K. Shaytan, Carlos Simmerling, Jeremy C. Smith, Yuji Sugita, Katarzyna Świderek, Makoto Taiji, Peng Tao, Julian Tirado-Rives, Inaki Tunón, Marc W. Van Der Kamp, David Van der Spoel, Sameer Velankar, Gregory A. Voth, Rebecca Wade, Ariel Warshel, Valerie Vaissier Welborn, Stacey Wetmore, Chung F. Wong, Lee-Wei Yang, Martin Zacharias, Modesto Orozco
Abstract: This letter illustrates the opinion of the molecular dynamics (MD) community on the need to adopt a new FAIR paradigm for the use of molecular simulations. It highlights the necessity of a collaborative effort to create, establish, and sustain a database that allows findability, accessibility, interoperability, and reusability of molecular dynamics simulation data. Such a development would democratize the field and significantly improve the impact of MD simulations on life science research. This will transform our working paradigm, pushing the field to a new frontier. We invite you to support our initiative at the MDDB community (https://mddbr.eu/community/)
Citations: 0
Molecular design for cardiac cell differentiation using a small dataset and decorated shape features
arXiv - QuanBio - Biomolecules Pub Date : 2024-07-22 DOI: arxiv-2407.15322
Fatemeh Etezadi, Shunichi Ito, Kosuke Yasui, Rodi Kado Abdalkader, Itsunari Minami, Motonari Uesugi, Ganesh Pandian Namasivayam, Haruko Nakano, Atsushi Nakano, Daniel M. Packwood
Abstract: The discovery of small organic compounds for inducing stem cell differentiation is a time- and resource-intensive process. While data science could, in principle, facilitate the discovery of these compounds, novel approaches are required due to the difficulty of acquiring training data from large numbers of example compounds. In this paper, we demonstrate the design of a new compound for inducing cardiomyocyte differentiation using simple regression models trained with a data set containing only 80 examples. We introduce decorated shape descriptors, an information-rich molecular feature representation that integrates both molecular shape and hydrophilicity information. These models demonstrate improved performance compared to ones using standard molecular descriptors based on shape alone. Model overtraining is diagnosed using a new type of sensitivity analysis. Our new compound is designed using a conservative molecular design strategy, and its effectiveness is confirmed through expression profiles of cardiomyocyte-related marker genes using real-time polymerase chain reaction experiments on human iPS cell lines. This work demonstrates a viable data-driven strategy for designing new compounds for stem cell differentiation protocols and will be useful in situations where training data is limited.
Citations: 0
Holographic nature of critical quantum states of proteins
arXiv - QuanBio - Biomolecules Pub Date : 2024-07-21 DOI: arxiv-2407.15101
Eszter Papp, Gabor Vattay
Abstract: The Anderson metal-insulator transition is a fundamental phenomenon in condensed matter physics, describing the transition from a conducting (metallic) to a non-conducting (insulating) state driven by disorder in a material. At the critical point of the Anderson transition, wave functions exhibit multifractal behavior, and energy levels display a universal distribution, indicating non-trivial correlations in the eigenstates. Recent studies have shown that proteins, traditionally considered as insulators, exhibit much higher conductivity than previously assumed. In this paper, we investigate several proteins known for their efficient electron transport properties. We compare their energy level statistics, eigenfunction correlation, and electron return probability to those expected in metallic, insulating, or critical states. Remarkably, these proteins exhibit properties of critically disordered metals in their natural state without any parameter adjustment. Their composition and geometry are self-organized into the critical state of the Anderson transition, and their fractal properties are universal and unique among critical systems. Our findings suggest that proteins' wave functions fulfill "holographic" area laws, and the correlation fractal dimension is precisely $d_2=2$.
Citations: 0
Exploiting Pre-trained Models for Drug Target Affinity Prediction with Nearest Neighbors
arXiv - QuanBio - Biomolecules Pub Date : 2024-07-21 DOI: arxiv-2407.15202
Qizhi Pei, Lijun Wu, Zhenyu He, Jinhua Zhu, Yingce Xia, Shufang Xie, Rui Yan
Abstract: Drug-Target binding Affinity (DTA) prediction is essential for drug discovery. Despite the application of deep learning methods to DTA prediction, the achieved accuracy remains suboptimal. In this work, inspired by the recent success of retrieval methods, we propose $k$NN-DTA, a non-parametric embedding-based retrieval method applied to a pre-trained DTA prediction model, which can extend the power of the DTA model at no or negligible cost. Different from existing methods, we introduce two ways of aggregating neighbors, one in embedding space and one in label space, that are integrated into a unified framework. Specifically, we propose a label aggregation with pair-wise retrieval and a representation aggregation with point-wise retrieval of the nearest neighbors. This method executes in the inference phase and can efficiently boost the DTA prediction performance with no training cost. In addition, we propose an extension, Ada-$k$NN-DTA, an instance-wise and adaptive aggregation with lightweight learning. Results on four benchmark datasets show that $k$NN-DTA brings significant improvements, outperforming previous state-of-the-art (SOTA) results; e.g., on the BindingDB IC$_{50}$ and $K_i$ testbeds, $k$NN-DTA obtains new records of RMSE 0.684 and 0.750. The extended Ada-$k$NN-DTA further improves the performance to 0.675 and 0.735 RMSE. These results strongly support the effectiveness of our method. Results in other settings and comprehensive studies/analyses also show the great potential of our $k$NN-DTA approach.
Citations: 0
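The label-aggregation idea named in the abstract above can be sketched as follows. The abstract does not specify how neighbors are weighted or how the retrieved labels are combined with the model output, so everything here is an assumption made for illustration: the function name `knn_label_aggregation`, the softmax weighting over negative Euclidean distances, the `temperature` parameter, and the interpolation coefficient `lam` are all hypothetical and are not the authors' implementation.

```python
import math


def knn_label_aggregation(query, memory, model_pred, k=2, temperature=1.0, lam=0.5):
    """Blend a model prediction with labels of the k nearest stored embeddings.

    query      -- embedding of the drug-target pair being predicted
    memory     -- list of (embedding, affinity_label) pairs from training data
    model_pred -- affinity predicted by the pre-trained model
    All weighting choices below are illustrative assumptions.
    """
    # Retrieve the k stored pairs closest to the query in embedding space.
    nearest = sorted((math.dist(query, emb), label) for emb, label in memory)[:k]
    # Assumed weighting: softmax over negative distances.
    weights = [math.exp(-dist / temperature) for dist, _ in nearest]
    knn_pred = sum(w * label for w, (_, label) in zip(weights, nearest)) / sum(weights)
    # Assumed combination: linear interpolation with the model prediction.
    return lam * model_pred + (1 - lam) * knn_pred


# Hypothetical 2-D embeddings and affinity labels.
memory = [((1.0, 0.0), 6.0), ((0.0, 1.0), 8.0), ((5.0, 5.0), 2.0)]
blended = knn_label_aggregation((0.0, 0.0), memory, model_pred=5.0)
print(f"Blended prediction: {blended:.2f}")
```

Nothing in this sketch is trained: the memory is built once from existing embeddings and labels and consulted only at inference time, which matches the abstract's description of a method that adds no training cost.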
Technical report: Improving the properties of molecules generated by LIMO
arXiv - QuanBio - Biomolecules Pub Date : 2024-07-20 DOI: arxiv-2407.14968
Vineet Thumuluri, Peter Eckmann, Michael K. Gilson, Rose Yu
Abstract: This technical report investigates variants of the Latent Inceptionism on Molecules (LIMO) framework to improve the properties of generated molecules. We conduct ablation studies of molecular representation, decoder model, and surrogate model training scheme. The experiments suggest that an autoregressive Transformer decoder with GroupSELFIES achieves the best average properties for the random generation task.
Citations: 0
mdCATH: A Large-Scale MD Dataset for Data-Driven Computational Biophysics
arXiv - QuanBio - Biomolecules Pub Date : 2024-07-20 DOI: arxiv-2407.14794
Antonio Mirarchi, Toni Giorgino, Gianni De Fabritiis
Abstract: Recent advancements in protein structure determination are revolutionizing our understanding of proteins. Still, a significant gap remains in the availability of comprehensive datasets that focus on the dynamics of proteins, which are crucial for understanding protein function, folding, and interactions. To address this critical gap, we introduce mdCATH, a dataset generated through an extensive set of all-atom molecular dynamics simulations of a diverse and representative collection of protein domains. This dataset comprises all-atom systems for 5,398 domains, modeled with a state-of-the-art classical force field, and simulated in five replicates each at five temperatures from 320 K to 413 K. The mdCATH dataset records coordinates and forces every 1 ns, for over 62 ms of accumulated simulation time, effectively capturing the dynamics of the various classes of domains and providing a unique resource for proteome-wide statistical analyses of protein unfolding thermodynamics and kinetics. We outline the dataset structure and showcase its potential through four easily reproducible case studies, highlighting its capabilities in advancing protein science.
Citations: 0
Decomposed Direct Preference Optimization for Structure-Based Drug Design
arXiv - QuanBio - Biomolecules Pub Date : 2024-07-19 DOI: arxiv-2407.13981
Xiwei Cheng, Xiangxin Zhou, Yuwei Yang, Yu Bao, Quanquan Gu
Abstract: Diffusion models have achieved promising results for Structure-Based Drug Design (SBDD). Nevertheless, high-quality protein subpocket and ligand data are relatively scarce, which hinders the models' generation capabilities. Recently, Direct Preference Optimization (DPO) has emerged as a pivotal tool for the alignment of generative models such as large language models and diffusion models, providing greater flexibility and accuracy by directly aligning model outputs with human preferences. Building on this advancement, we introduce DPO to SBDD in this paper. We tailor diffusion models to pharmaceutical needs by aligning them with elaborately designed chemical score functions. We propose a new structure-based molecular optimization method called DecompDPO, which decomposes the molecule into arms and scaffolds and performs preference optimization at both local substructure and global molecule levels, allowing for more precise control with fine-grained preferences. Notably, DecompDPO can be effectively used for two main purposes: (1) fine-tuning pretrained diffusion models for molecule generation across various protein families, and (2) molecular optimization given a specific protein subpocket after generation. Extensive experiments on the CrossDocked2020 benchmark show that DecompDPO significantly improves model performance in both molecule generation and optimization, with up to 100% Median High Affinity and a 54.9% Success Rate.
Citations: 0