Advancing the development of deep learning and machine learning models for oral drugs through diverse descriptor classes: a focus on pharmacokinetic parameters (Vdss and PPB).
Authors: Rakesh Bantu, Samiron Phukan, Simon Haydar
Journal: Molecular Diversity (impact factor 3.9; JCR Q2, Chemistry, Applied)
Published: 2025-06-11
DOI: 10.1007/s11030-025-11235-1 (https://doi.org/10.1007/s11030-025-11235-1)
Abstract
In the present study, we report predictive deep learning (DL) and machine learning (ML) models for pharmacokinetic (PK) parameters such as volume of distribution at steady state (Vdss) and plasma protein binding (PPB). Using DL and ML algorithms, our study provides deeper and novel insights into the role of molecular descriptors in determining PK parameters such as Vdss and PPB. FDA-approved drugs with an oral route of administration and reported PK parameters were taken as the dataset. These were used to establish the foundational datasets, followed by computation of different molecular descriptor classes. Feature engineering with the Boruta algorithm produced a significant increase in the accuracy of the models. The features identified by the Boruta algorithm were used to train different models separately for Vdss and PPB. For Vdss, the highest predictive scores were achieved by gradient boosting (GB) and the Stacking Classifier, with 80% and 78%, respectively. For PPB, random forest and GB achieved the highest scores of 73% and 71%, respectively, compared with all other algorithms. In summary, we report appropriate ML algorithms such as the Stacking Classifier, utilizing a previously unreported feature engineering algorithm, to predict Vdss and PPB individually, considering over 67 descriptors each, with ≥ 80% and 73% accuracy, respectively. Additionally, we developed models based on the descriptors shared between Vdss and PPB. Quantum chemical descriptors such as MLFERs (MLFER_BH, MLFER_BO and MLFER_E) and topological descriptors such as piPC5, piPC6, piPC9 and TpiPC were identified as the common drivers of the functional activity of Vdss and PPB together.
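The abstract does not describe how the Boruta step was configured, so the following is only a minimal sketch of the idea behind it: a descriptor is kept when it scores higher than the best shuffled ("shadow") copy of the descriptors in most rounds. Several things here are assumptions rather than the authors' method. Absolute Pearson correlation is swapped in for the random-forest importance that Boruta normally uses, a fixed hit-rate threshold replaces Boruta's statistical test, and the data, the descriptor names (`informative`, `noise`, `constant`) and the helper names are hypothetical.

```python
import random
import statistics


def abs_pearson(xs, ys):
    """Absolute Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return abs(sxy) / (sxx * syy) ** 0.5


def shadow_feature_select(columns, target, n_rounds=50, min_hit_rate=0.9, seed=0):
    """Keep descriptors that beat the best shadow feature in most rounds.

    columns: dict mapping a descriptor name to its list of values.
    target:  list of target values (here a 0/1 class label).
    Simplification: importance is |Pearson r|, not random-forest importance.
    """
    rng = random.Random(seed)
    # Scores of the real descriptors do not change between rounds.
    real = {name: abs_pearson(values, target) for name, values in columns.items()}
    hits = {name: 0 for name in columns}
    for _ in range(n_rounds):
        # Shadow features: shuffled copies whose link to the target is broken.
        shadow_max = 0.0
        for values in columns.values():
            shadow = values[:]
            rng.shuffle(shadow)
            shadow_max = max(shadow_max, abs_pearson(shadow, target))
        for name in columns:
            if real[name] > shadow_max:
                hits[name] += 1
    return sorted(n for n, h in hits.items() if h / n_rounds >= min_hit_rate)


# Synthetic, hypothetical data: one descriptor tracks the class label,
# one is pure noise, and one is constant (zero variance).
data_rng = random.Random(42)
target = [i % 2 for i in range(200)]
columns = {
    "informative": [t + data_rng.gauss(0, 0.3) for t in target],
    "noise": [data_rng.gauss(0, 1) for _ in target],
    "constant": [1.0] * len(target),
}
selected = shadow_feature_select(columns, target)
print(selected)
```

In a workflow like the one the abstract describes, the retained descriptor names would then be passed on to the downstream classifiers (for example gradient boosting or a stacking ensemble), trained separately for Vdss and PPB.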
About the journal:
Molecular Diversity is a new publication forum for the rapid publication of refereed papers dedicated to describing the development, application and theory of molecular diversity and combinatorial chemistry in basic and applied research and drug discovery. The journal publishes both short and full papers, perspectives, news and reviews dealing with all aspects of the generation of molecular diversity, application of diversity for screening against alternative targets of all types (biological, biophysical, technological), analysis of results obtained and their application in various scientific disciplines/approaches including:
combinatorial chemistry and parallel synthesis;
small molecule libraries;
microwave synthesis;
flow synthesis;
fluorous synthesis;
diversity oriented synthesis (DOS);
nanoreactors;
click chemistry;
multiplex technologies;
fragment- and ligand-based design;
structure/function/SAR;
computational chemistry and molecular design;
chemoinformatics;
screening techniques and screening interfaces;
analytical and purification methods;
robotics, automation and miniaturization;
targeted libraries;
display libraries;
peptides and peptoids;
proteins;
oligonucleotides;
carbohydrates;
natural diversity;
new methods of library formulation and deconvolution;
directed evolution, origin of life and recombination;
search techniques, landscapes, random chemistry and more.