Prediction of UGT-mediated phase II metabolism via ligand- and structure-based predictive models
Ludovica Bono, Filippo Lunghini, Emanuela Sabato, Akash Deep Biswas, Angelica Mazzolari, Alessandro Pedretti, Andrea R. Beccari, Giulio Vistoli, Serena Vittorio
Journal of Cheminformatics, published 2025-10-15. DOI: 10.1186/s13321-025-01097-y (https://doi.org/10.1186/s13321-025-01097-y)
Abstract
Predicting drug metabolism with in silico techniques is attracting growing interest because large datasets can be processed, allowing the stability and safety of new drug candidates to be evaluated in the early stages of drug discovery. To date, in silico models for metabolism prediction mainly exploit the ligand-based (LB) properties of the training molecules to predict the occurrence of a given metabolic reaction and/or the reactive site involved in the biotransformation. However, recent reports have highlighted that structure-based (SB) modeling can be conveniently integrated with LB methods for drug metabolism prediction, with the advantage of predicting whether a given molecule can fit the enzyme active site and which moiety approaches the catalytic residues. Herein, we developed machine learning models for UDP-glucuronosyltransferase (UGT)-mediated metabolism using both LB and SB methods. In particular, this study focused on the UGT2B7 and UGT2B15 isoforms, which are involved in the clearance of many drugs as well as in clinically relevant drug-drug interactions. First, molecular dynamics (MD) and docking simulations were combined to explore the binding mechanism of cofactor and substrate within the catalytic pocket of the studied UGT isoforms, exploiting their AlphaFold structures. Analysis of the MD trajectories allowed an appropriate conformation of both UGT isoforms to be identified for the development of binary classification models. For this purpose, the Random Forest algorithm and metabolic data extracted from the MetaQSAR database were used. SB models were trained on a set of scoring functions and protein–ligand interaction fingerprints derived from docking, while the LB models were built on a set of physicochemical and constitutional descriptors. When the single models were evaluated, the LB classifiers outperformed the SB models. However, applying a consensus strategy improved prediction accuracy compared with the individual models, highlighting that LB and SB approaches convey complementary information whose aggregation yields better predictions than either model alone.
Metabolism prediction through in silico methods is a useful tool for assessing the pharmacokinetic profile of new drug candidates in the early stages of drug discovery. This study provides a new computational strategy that integrates ligand- and structure-based approaches for the prediction of UGT2B7- and UGT2B15-mediated metabolism, exploiting their AlphaFold structures. The combination of both methodologies yielded enhanced performance compared with the individual ligand- and structure-based predictive models, also confirming the reliability of AlphaFold structures for developing structure-based models for metabolism prediction.
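The abstract does not state which consensus rule was used, so the following is only a minimal sketch of one common option: averaging the substrate probabilities returned by an LB and an SB binary classifier and thresholding the mean. The function names, the equal weighting, the 0.5 threshold, and all numeric values are assumptions made for illustration; none of them come from the study or from MetaQSAR.

```python
from typing import List, Sequence


def consensus_probability(p_lb: float, p_sb: float, weight_lb: float = 0.5) -> float:
    """Weighted mean of the LB and SB substrate probabilities (assumed rule)."""
    if not 0.0 <= weight_lb <= 1.0:
        raise ValueError("weight_lb must lie in [0, 1]")
    return weight_lb * p_lb + (1.0 - weight_lb) * p_sb


def classify(probabilities: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Label 1 (substrate) when the probability reaches the threshold, else 0."""
    return [1 if p >= threshold else 0 for p in probabilities]


def accuracy(predicted: Sequence[int], observed: Sequence[int]) -> float:
    """Fraction of molecules whose predicted label matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)


# Toy values invented for illustration only; they are NOT data from the study.
# They are chosen so that the two models make errors on different molecules.
observed = [1, 1, 0, 0, 1, 0]
p_lb = [0.90, 0.40, 0.20, 0.60, 0.70, 0.10]  # hypothetical LB model outputs
p_sb = [0.60, 0.80, 0.45, 0.30, 0.35, 0.55]  # hypothetical SB model outputs

p_consensus = [consensus_probability(a, b) for a, b in zip(p_lb, p_sb)]

acc_lb = accuracy(classify(p_lb), observed)
acc_sb = accuracy(classify(p_sb), observed)
acc_consensus = accuracy(classify(p_consensus), observed)

print(f"LB: {acc_lb:.2f}  SB: {acc_sb:.2f}  consensus: {acc_consensus:.2f}")
```

In this toy setting each model misclassifies two different molecules, so the averaged probability recovers the correct label in every case. This only illustrates how complementary errors can cancel under a consensus rule; it says nothing about the size of the improvement reported in the paper.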
About the journal:
Journal of Cheminformatics is an open access journal publishing original peer-reviewed research in all aspects of cheminformatics and molecular modelling.
Coverage includes, but is not limited to:
chemical information systems, software and databases, and molecular modelling,
chemical structure representations and their use in structure, substructure, and similarity searching of chemical substance and chemical reaction databases,
computer and molecular graphics, computer-aided molecular design, expert systems, QSAR, and data mining techniques.