PKSmart: an open-source computational model to predict intravenous pharmacokinetics of small molecules

IF 5.7, Zone 2 (Chemistry), JCR Q1: CHEMISTRY, MULTIDISCIPLINARY

Journal of Cheminformatics, Pub Date: 2025-09-26, DOI: 10.1186/s13321-025-01066-5

Srijit Seal, Maria-Anna Trapotsi, Manas Mahale, Vigneshwari Subramanian, Nigel Greene, Ola Spjuth, Andreas Bender

Citations: 0

Abstract

PKSmart: an open-source computational model to predict intravenous pharmacokinetics of small molecules

Drug exposure, a key determinant of drug safety and efficacy, is governed by pharmacokinetic (PK) parameters such as volume of distribution (VDss), clearance (CL), half-life (t½), fraction unbound in plasma (fu), and mean residence time (MRT). In this study, we developed machine learning models to predict human PK parameters for 1,283 unique compounds using molecular structure, physicochemical properties, and predicted animal PK data. Our approach involved a two-stage modeling pipeline. First, we trained models to predict rat, dog, and monkey PK parameters (VDss, CL, fu) from chemical structure and properties for 371 compounds. These models were used to predict animal PK values for 1,283 unique compounds with human PK data. These animal PK predictions were then integrated with molecular descriptors and fingerprints to build Random Forest models for human PK parameters. The models demonstrated consistent performance across nested cross-validation and external validation sets, with predictive accuracy for VDss comparable to proprietary models developed by AstraZeneca. Notably, human VDss and CL predictions achieved external R² values of 0.39 and 0.46, respectively. To support broad accessibility and integration into early drug discovery workflows such as Design-Make-Test-Analyze (DMTA), we developed PKSmart (https://broad.io/PKSmart), a freely available web application. All code and models are also open source, enabling local deployment. To our knowledge, this represents the first public suite of PK prediction models with performance on par with industry standard models.
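The two-stage design described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' code: a toy k-nearest-neighbour regressor stands in for Random Forest so the example runs with only the Python standard library, the descriptor vectors and targets are synthetic, and every name (`KNNRegressor`, `augment`, `fit_two_stage`, `rat_VDss`, `dog_CL`) is hypothetical.

```python
import random


class KNNRegressor:
    """Toy k-nearest-neighbour regressor (a stand-in for Random Forest)."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, X, y):
        self.X, self.y = [list(row) for row in X], list(y)
        return self

    def predict(self, X):
        return [self._predict_one(x) for x in X]

    def _predict_one(self, x):
        # Rank training points by squared Euclidean distance to x.
        ranked = sorted(
            (sum((a - b) ** 2 for a, b in zip(row, x)), target)
            for row, target in zip(self.X, self.y)
        )
        nearest = ranked[: self.k]
        return sum(target for _, target in nearest) / len(nearest)


def augment(stage1_models, X):
    """Append each stage-1 (animal) prediction as an extra feature column."""
    names = sorted(stage1_models)  # fixed column order
    columns = [stage1_models[name].predict(X) for name in names]
    return [list(x) + [col[i] for col in columns] for i, x in enumerate(X)]


def fit_two_stage(animal_X, animal_targets, human_X, human_y):
    # Stage 1: one model per animal endpoint, trained on the animal set.
    stage1 = {name: KNNRegressor().fit(animal_X, y)
              for name, y in animal_targets.items()}
    # Stage 2: human model trained on descriptors plus predicted animal PK.
    stage2 = KNNRegressor().fit(augment(stage1, human_X), human_y)
    return stage1, stage2


def predict_two_stage(stage1, stage2, X):
    return stage2.predict(augment(stage1, X))


# Synthetic stand-in data: 4 made-up descriptors per compound.
rng = random.Random(0)


def random_compounds(n):
    return [[rng.random() for _ in range(4)] for _ in range(n)]


animal_X = random_compounds(30)
animal_targets = {
    "rat_VDss": [x[0] + x[1] for x in animal_X],
    "dog_CL": [x[2] - x[3] for x in animal_X],
}
human_X = random_compounds(50)
human_y = [x[0] + 0.5 * x[2] for x in human_X]

stage1, stage2 = fit_two_stage(animal_X, animal_targets, human_X, human_y)
new_compounds = random_compounds(3)
print(predict_two_stage(stage1, stage2, new_compounds))
```

The point of the sketch is the data flow: the human-stage model never sees measured animal data, only the stage-1 predictions, so it can be applied to any compound for which descriptors can be computed.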

This study introduces the first publicly available pharmacokinetic (PK) models that match industry-standard predictions, utilizing molecular structural fingerprints, physicochemical properties, and predicted animal PK data to model human pharmacokinetics. Our approach is validated through repeated nested cross-validation and an external test set, including comparing predictions to an industry standard model. The models are released via a web-hosted application (https://broad.io/PKSmart) for wider accessibility and utility in drug development processes.
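For readers unfamiliar with the validation scheme, the following sketch shows the general shape of nested cross-validation scored with R². It is an illustration only, assuming a toy k-nearest-neighbour model and a small hyperparameter grid over `k`; the fold counts, the grid, and all function names are hypothetical and are not taken from the paper.

```python
import random


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot  # undefined if all y_true are equal


def knn_predict(train_X, train_y, x, k):
    """Average the targets of the k training points nearest to x."""
    ranked = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), target)
        for row, target in zip(train_X, train_y)
    )
    nearest = ranked[:k]
    return sum(target for _, target in nearest) / len(nearest)


def make_folds(n, n_folds, rng):
    """Shuffle indices 0..n-1 and deal them into n_folds disjoint folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::n_folds] for i in range(n_folds)]


def split(X, y, test_idx):
    held = set(test_idx)
    train = [i for i in range(len(X)) if i not in held]
    return ([X[i] for i in train], [y[i] for i in train],
            [X[j] for j in test_idx], [y[j] for j in test_idx])


def cv_score(X, y, k, n_folds, rng):
    """Mean R² of a k-NN model over n_folds folds."""
    scores = []
    for test_idx in make_folds(len(X), n_folds, rng):
        X_tr, y_tr, X_te, y_te = split(X, y, test_idx)
        preds = [knn_predict(X_tr, y_tr, x, k) for x in X_te]
        scores.append(r2_score(y_te, preds))
    return sum(scores) / len(scores)


def nested_cv(X, y, k_grid=(1, 3, 5), outer=5, inner=3, seed=0):
    """Return one R² per outer fold; k is tuned on inner folds only."""
    rng = random.Random(seed)
    outer_scores = []
    for test_idx in make_folds(len(X), outer, rng):
        X_tr, y_tr, X_te, y_te = split(X, y, test_idx)
        # Inner loop: pick k using the outer-training data alone,
        # so the outer test fold never influences model selection.
        best_k = max(k_grid, key=lambda k: cv_score(X_tr, y_tr, k, inner, rng))
        preds = [knn_predict(X_tr, y_tr, x, best_k) for x in X_te]
        outer_scores.append(r2_score(y_te, preds))
    return outer_scores
```

Calling `nested_cv` several times with different `seed` values and pooling the outer-fold scores gives a repeated variant. An external test set is scored the same way with `r2_score`, using a model fitted on all of the training data.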

Source journal

Journal of Cheminformatics (CHEMISTRY, MULTIDISCIPLINARY; COMPUTER SCIENCE, INFORMATION SYSTEMS)

CiteScore: 14.10
Self-citation rate: 7.00%
Articles published: (not given)
Review time: 3 months

About the journal: Journal of Cheminformatics is an open access journal publishing original peer-reviewed research in all aspects of cheminformatics and molecular modelling. Coverage includes, but is not limited to: chemical information systems, software and databases, and molecular modelling, chemical structure representations and their use in structure, substructure, and similarity searching of chemical substance and chemical reaction databases, computer and molecular graphics, computer-aided molecular design, expert systems, QSAR, and data mining techniques.