{"title":"Leveraging Latent Evolutionary Optimization for Targeted Molecule Generation","authors":"Siddartha Reddy N, Sai Prakash MV, Varun V, Vishal Vaddina, Saisubramaniam Gopalakrishnan","doi":"arxiv-2407.13779","DOIUrl":"https://doi.org/arxiv-2407.13779","url":null,"abstract":"Lead optimization is a pivotal task in the drug design phase within the drug discovery lifecycle. The primary objective is to refine the lead compound to meet specific molecular properties for progression to the subsequent phase of development. In this work, we present an innovative approach, Latent Evolutionary Optimization for Molecule Generation (LEOMol), a generative modeling framework for the efficient generation of optimized molecules. LEOMol leverages Evolutionary Algorithms, such as Genetic Algorithm and Differential Evolution, to search the latent space of a Variational AutoEncoder (VAE). This search facilitates the identification of the target molecule distribution within the latent space. Our approach consistently demonstrates superior performance compared to previous state-of-the-art models across a range of constrained molecule generation tasks, outperforming existing models in all four sub-tasks related to property targeting. Additionally, we suggest the importance of including toxicity in the evaluation of generative models. Furthermore, an ablation study underscores the improvements that our approach provides over gradient-based latent space optimization methods. This underscores the effectiveness and superiority of LEOMol in addressing the inherent challenges in constrained molecule generation while emphasizing its potential to propel advancements in drug discovery.","PeriodicalId":501022,"journal":{"name":"arXiv - QuanBio - Biomolecules","volume":"57 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141745034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DrugCLIP: Contrastive Drug-Disease Interaction For Drug Repurposing","authors":"Yingzhou Lu, Yaojun Hu, Chenhao Li","doi":"arxiv-2407.02265","DOIUrl":"https://doi.org/arxiv-2407.02265","url":null,"abstract":"Bringing a novel drug from the original idea to market typically requires more than ten years and billions of dollars. To alleviate the heavy burden, a natural idea is to reuse approved drugs to treat new diseases. The process is also known as drug repurposing or drug repositioning. Machine learning methods have exhibited huge potential in automating drug repurposing. However, they still encounter some challenges, such as a lack of labels and multimodal feature representation. To address these issues, we design DrugCLIP, a cutting-edge contrastive learning method, to learn drug-disease interactions without negative labels. Additionally, we have curated a drug repurposing dataset based on real-world clinical trial records. Thorough empirical studies are conducted to validate the effectiveness of the proposed DrugCLIP method.","PeriodicalId":501022,"journal":{"name":"arXiv - QuanBio - Biomolecules","volume":"116 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141525286","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AI-driven Alternative Medicine: A Novel Approach to Drug Discovery and Repurposing","authors":"Oleksandr Bilokon, Nataliya Bilokon, Paul Bilokon","doi":"arxiv-2407.02126","DOIUrl":"https://doi.org/arxiv-2407.02126","url":null,"abstract":"AIAltMed is a cutting-edge platform designed for drug discovery and repurposing. It utilizes Tanimoto similarity to identify non-medicinal compounds that are structurally similar to known medicinal ones. This preprint introduces AIAltMed, discusses the concept of 'AI-driven alternative medicine,' evaluates Tanimoto similarity's advantages and limitations, and details the system's architecture. Furthermore, it explores the benefits of extending the system to include PubChem and outlines a corresponding implementation strategy.","PeriodicalId":501022,"journal":{"name":"arXiv - QuanBio - Biomolecules","volume":"45 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141525285","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FreeCG: Free the Design Space of Clebsch-Gordan Transform for machine learning force field","authors":"Shihao Shao, Haoran Geng, Qinghua Cui","doi":"arxiv-2407.02263","DOIUrl":"https://doi.org/arxiv-2407.02263","url":null,"abstract":"The Clebsch-Gordan Transform (CG transform) effectively encodes many-body interactions. Many studies have proven its accuracy in depicting atomic environments, although this comes with high computational needs. The computational burden of this challenge is hard to reduce due to the need for permutation equivariance, which limits the design space of the CG transform layer. We show that implementing the CG transform layer on permutation-invariant inputs allows complete freedom in the design of this layer without affecting symmetry. Developing further on this premise, our idea is to create a CG transform layer that operates on permutation-invariant abstract edges generated from real edge information. We bring in group CG transform with sparse path, abstract edges shuffling, and attention enhancer to form a powerful and efficient CG transform layer. Our method, known as FreeCG, achieves State-of-The-Art (SoTA) results in force prediction for MD17, rMD17, MD22, and property prediction in QM9 datasets with notable enhancement. It introduces a novel paradigm for carrying out efficient and expressive CG transform in future geometric neural network designs.","PeriodicalId":501022,"journal":{"name":"arXiv - QuanBio - Biomolecules","volume":"209 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141525289","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ZeroDDI: A Zero-Shot Drug-Drug Interaction Event Prediction Method with Semantic Enhanced Learning and Dual-Modal Uniform Alignment","authors":"Ziyan Wang, Zhankun Xiong, Feng Huang, Xuan Liu, Wen Zhang","doi":"arxiv-2407.00891","DOIUrl":"https://doi.org/arxiv-2407.00891","url":null,"abstract":"Drug-drug interactions (DDIs) can result in various pharmacological changes, which can be categorized into different classes known as DDI events (DDIEs). In recent years, previously unobserved/unseen DDIEs have been emerging, posing a new classification task when unseen classes have no labelled instances in the training stage, which is formulated as a zero-shot DDIE prediction (ZS-DDIE) task. However, existing computational methods are not directly applicable to ZS-DDIE, which has two primary challenges: obtaining suitable DDIE representations and handling the class imbalance issue. To overcome these challenges, we propose a novel method named ZeroDDI for the ZS-DDIE task. Specifically, we design a biological semantic enhanced DDIE representation learning module, which emphasizes the key biological semantics and distills discriminative molecular substructure-related semantics for DDIE representation learning. Furthermore, we propose a dual-modal uniform alignment strategy to distribute drug pair representations and DDIE semantic representations uniformly in a unit sphere and align the matched ones, which can mitigate the issue of class imbalance. Extensive experiments show that ZeroDDI surpasses the baselines and indicate that it is a promising tool for detecting unseen DDIEs. Our code has been released at https://github.com/wzy-Sarah/ZeroDDI.","PeriodicalId":501022,"journal":{"name":"arXiv - QuanBio - Biomolecules","volume":"33 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141525288","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Impact of Co-Excipient Selection on Hydrophobic Polymer Folding: Insights for Optimal Formulation Design","authors":"Jonathan W. P. Zajac, Praveen Muralikrishnan, Caryn L. Heldt, Sarah L. Perry, Sapna Sarupria","doi":"arxiv-2407.00885","DOIUrl":"https://doi.org/arxiv-2407.00885","url":null,"abstract":"The stabilization of liquid biological products is a complex task that depends on the chemical composition of both the active ingredient and any excipients in solution. Frequently, a large number of unique excipients are required to stabilize biologics, though it is not well-known how these excipients interact with one another. To probe these excipient-excipient interactions, we performed molecular dynamics simulations of arginine -- a widely used excipient with unique properties -- in solution either alone or with equimolar lysine or glutamate. We studied the effects of these mixtures on a hydrophobic polymer model to isolate excipient mechanisms on hydrophobic interactions, relevant to both protein folding and biomolecular self-assembly. We observed that arginine is the most effective single excipient in stabilizing hydrophobic polymer collapse, and its effectiveness can be augmented by lysine or glutamate addition. We utilized a decomposition of the potential of mean force to identify that the key source of arginine-lysine and arginine-glutamate synergy on polymer collapse is a reduction in attractive polymer-excipient direct interactions. Further, we applied principles from network theory to characterize the local solvent network that embeds the hydrophobic polymer. Through this approach, we found that arginine enables a more highly connected and stable network than in pure water, lysine, or glutamate solutions. Importantly, these network properties are preserved when lysine or glutamate are added to arginine solutions. Overall, we highlight the importance of identifying key molecular consequences of co-excipient selection, aiding in the establishment of rational formulation design rules.","PeriodicalId":501022,"journal":{"name":"arXiv - QuanBio - Biomolecules","volume":"111 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141525290","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Frontiers in integrative structural biology: modeling disordered proteins and utilizing in situ data","authors":"Kartik Majila, Shreyas Arvindekar, Muskaan Jindal, Shruthi Viswanath","doi":"arxiv-2407.00566","DOIUrl":"https://doi.org/arxiv-2407.00566","url":null,"abstract":"Integrative modeling enables structure determination for large macromolecular assemblies by combining data from multiple sources of experimental data with theoretical and computational predictions. Recent advancements in AI-based structure prediction and electron cryo-microscopy have sparked renewed enthusiasm for integrative modeling; structures from AI-based methods can be integrated with in situ maps to characterize large assemblies. This approach previously allowed us and others to determine the architectures of diverse macromolecular assemblies, such as nuclear pore complexes, chromatin remodelers, and cell-cell junctions. Experimental data spanning several scales was used in these studies, ranging from high-resolution data, such as X-ray crystallography and AlphaFold structures, to low-resolution data, such as cryo-electron tomography maps and data from co-immunoprecipitation experiments. Two recurrent modeling challenges emerged across a range of studies. First, modeling disordered regions, which constituted a significant portion of these assemblies, necessitated the development of new methods. Second, methods needed to be developed to utilize the information from cryo-electron tomography, a timely challenge as structural biology is increasingly moving towards in situ characterization. Here, we recapitulate recent developments in the modeling of disordered proteins and the analysis of cryo-electron tomography data and highlight opportunities for method development in the context of integrative modeling.","PeriodicalId":501022,"journal":{"name":"arXiv - QuanBio - Biomolecules","volume":"26 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141509046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DCI: An Accurate Quality Assessment Criteria for Protein Complex Structure Models","authors":"Wenda Wang, Jiaqi Zhai, He Huang, Xinqi Gong","doi":"arxiv-2407.00560","DOIUrl":"https://doi.org/arxiv-2407.00560","url":null,"abstract":"The structure of proteins is the basis for studying protein function and drug design. The emergence of AlphaFold 2 has greatly promoted the prediction of protein 3D structures, and it is of great significance to give an overall and accurate evaluation of the predicted models, especially the complex models. Among the existing methods for evaluating multimer structures, DockQ is the most commonly used. However, as a metric more suitable for complex docking, DockQ cannot provide a unique and accurate evaluation in the non-docking situation. Therefore, it is necessary to propose an evaluation strategy that can directly evaluate the whole complex without limitation and achieve good results. In this work, we propose the DCI score, a new evaluation strategy for protein complex structure models, which is based only on the distance map and the CI (contact-interface) map. DCI focuses on the prediction accuracy of the contact interface based on the overall evaluation of complex structure, is not inferior to DockQ in evaluation accuracy according to CAPRI classification, and is able to handle the non-docking situation better than DockQ. Besides, we calculated the DCI score on CASP datasets and compared it with the CASP official assessment, which obtained good results. In addition, we found that DCI can better evaluate the overall structure deviation caused by interface prediction errors in the case of multi-chains. Our DCI is available at https://gitee.com/WendaWang/DCI-score.git, and the online server is available at http://mialab.ruc.edu.cn/DCIServer/.","PeriodicalId":501022,"journal":{"name":"arXiv - QuanBio - Biomolecules","volume":"27 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141525287","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"T- Hop: A framework for studying the importance path information in molecular graphs for chemical property prediction","authors":"Abdulrahman Ibraheem, Narsis Kiani, Jesper Tegner","doi":"arxiv-2407.14270","DOIUrl":"https://doi.org/arxiv-2407.14270","url":null,"abstract":"This paper studies the usefulness of incorporating path information in predicting chemical properties from molecular graphs, in the domain of QSAR (Quantitative Structure-Activity Relationship). Towards this, we developed a GNN-style model which can be toggled to operate in one of two modes: a non-degenerate mode which incorporates path information, and a degenerate mode which leaves out path information. Thus, by comparing the performance of the non-degenerate mode versus the degenerate mode on relevant QSAR datasets, we were able to directly assess the significance of path information on those datasets. Our results corroborate previous works, by suggesting that the usefulness of path information is dataset-dependent. Unlike previous studies, however, we took the very first steps towards building a model that could predict upfront whether or not path information would be useful for a given dataset at hand. Moreover, we also found that, despite its simplicity, the degenerate mode of our model yielded rather surprising results, which outperformed more sophisticated SOTA models in certain cases.","PeriodicalId":501022,"journal":{"name":"arXiv - QuanBio - Biomolecules","volume":"129 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141745035","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"ContactNet: Geometric-Based Deep Learning Model for Predicting Protein-Protein Interactions","authors":"Matan Halfon, Tomer Cohen, Raanan Fattal, Dina Schneidman-Duhovny","doi":"arxiv-2406.18314","DOIUrl":"https://doi.org/arxiv-2406.18314","url":null,"abstract":"Deep learning approaches achieved significant progress in predicting protein structures. These methods are often applied to protein-protein interactions (PPIs) yet require Multiple Sequence Alignment (MSA) which is unavailable for various interactions, such as antibody-antigen. Computational docking methods are capable of sampling accurate complex models, but also produce thousands of invalid configurations. The design of scoring functions for identifying accurate models is a long-standing challenge. We develop a novel attention-based Graph Neural Network (GNN), ContactNet, for classifying PPI models obtained from docking algorithms into accurate and incorrect ones. When trained on docked antigen and modeled antibody structures, ContactNet doubles the accuracy of current state-of-the-art scoring functions, achieving accurate models among its Top-10 in 43% of the test cases. When applied to unbound antibodies, its Top-10 accuracy increases to 65%. This performance is achieved without MSA and the approach is applicable to other types of interactions, such as host-pathogens or general PPIs.","PeriodicalId":501022,"journal":{"name":"arXiv - QuanBio - Biomolecules","volume":"2015 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141500560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}