MolDecor: Leveraging Transformers to Decorate Bioactive Molecules for Property Optimization
Authors: Dibyajyoti Das, Sarveswara Rao Vangala, Arijit Roy
Journal: Journal of Chemical Information and Modeling (impact factor 5.3; JCR Q1, Chemistry, Medicinal)
Published: 2025-09-29
DOI: 10.1021/acs.jcim.5c01151 (https://doi.org/10.1021/acs.jcim.5c01151)
Citations: 0
Abstract
Lead optimization is a critical stage in drug discovery in which promising molecules (lead molecules) are refined further. It involves modifying the chemical structure of the lead molecule to improve its pharmacological properties and drug-like characteristics for development into potential therapies. In this study, we developed a pipeline that (a) creates a property-specific fragment (decorator) library through an automated method, (b) learns the fragment-scaffold relationship using a BERT-based transformer model, and (c) decorates a given scaffold with fragments from the generated library to improve the properties of the lead molecule. The transformer-based model, MolDecor (Molecule Decorator), was trained on drug-like molecules to learn the optimal decorators for property optimization at single or multiple attachment points on the main scaffold of the lead molecule. It was then fine-tuned by transfer learning on property-specific data sets, such as solubility and affinity, to optimize those properties. By learning the relationship between scaffolds and decorators, the model avoids bias toward the most commonly used decorators. This also ensures that the generated molecules are easy to synthesize. To enhance solubility, the model was tested on an anticancer drug (Thalidomide), an antimalarial molecule (Compound 2), and an estrogen receptor modulator (Cyclofenil). Additionally, the model was applied to optimize the affinities of molecules targeting Janus kinase 1.
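To make the idea of decorating a scaffold at single or multiple attachment points concrete, here is a minimal sketch in plain Python. It is not the MolDecor method. It only enumerates every combination of decorators by text substitution on SMILES strings, whereas the abstract describes a trained transformer that learns which decorators suit a scaffold. The `[*:n]` attachment-point notation, the function names, the toy library, and the example scaffold are all assumptions made for illustration. The substitution is only safe for acyclic decorators whose first atom is the attachment atom, because ring-closure digits in a decorator could collide with those of the scaffold.

```python
import re
from itertools import product

# Assumed notation: attachment points are written as [*:1], [*:2], ...
ATTACHMENT = re.compile(r"\[\*:(\d+)\]")


def attachment_points(scaffold):
    """Return the sorted attachment-point labels found in a scaffold SMILES."""
    return sorted({int(label) for label in ATTACHMENT.findall(scaffold)})


def decorate(scaffold, assignment):
    """Replace each labelled attachment point with its assigned decorator.

    `assignment` maps an attachment-point label to a decorator SMILES.
    Decorators must be acyclic and start with the attaching atom.
    """
    return ATTACHMENT.sub(lambda m: assignment[int(m.group(1))], scaffold)


def enumerate_decorations(scaffold, library):
    """Yield every decorated SMILES obtainable from the library."""
    points = attachment_points(scaffold)
    for combo in product(library, repeat=len(points)):
        yield decorate(scaffold, dict(zip(points, combo)))


# Hypothetical toy inputs: a benzene scaffold with two attachment points
# and three small polar decorators.
library = ["O", "N", "C(=O)O"]
scaffold = "c1cc([*:1])ccc1([*:2])"

results = list(enumerate_decorations(scaffold, library))
for smiles in results:
    print(smiles)
```

Exhaustive enumeration grows as the library size raised to the number of attachment points, which is one reason a learned model that proposes or ranks decorators per scaffold is attractive. In practice the outputs would also be validated with a cheminformatics toolkit before any property is predicted.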
About the journal:
The Journal of Chemical Information and Modeling publishes papers reporting new methodology and/or important applications in the fields of chemical informatics and molecular modeling. Specific topics include the representation and computer-based searching of chemical databases, molecular modeling, computer-aided molecular design of new materials, catalysts, or ligands, development of new computational methods or efficient algorithms for chemical software, and biopharmaceutical chemistry including analyses of biological activity and other issues related to drug discovery.
Astute chemists, computer scientists, and information specialists look to this monthly’s insightful research studies, programming innovations, and software reviews to keep current with advances in this integral, multidisciplinary field.
As a subscriber you’ll stay abreast of database search systems, use of graph theory in chemical problems, substructure search systems, pattern recognition and clustering, analysis of chemical and physical data, molecular modeling, graphics and natural language interfaces, bibliometric and citation analysis, and synthesis design and reactions databases.