Physicochemical QSAR analysis of hERG inhibition revisited: towards a quantitative potency prediction

IF 3.0 | Zone 3, Biology | JCR Q3, Biochemistry & Molecular Biology
Kiril Lanevskij, Remigijus Didziapetris, Andrius Sazonovas
Journal of Computer-Aided Molecular Design 36(12), 837–849. Journal Article. Published 2022-10-28.
DOI: 10.1007/s10822-022-00483-0
https://link.springer.com/article/10.1007/s10822-022-00483-0
Citations: 5

Abstract

In an earlier study (Didziapetris R & Lanevskij K (2016). J Comput Aided Mol Des. 30:1175–1188) we collected a database of publicly available hERG inhibition data for almost 6700 drug-like molecules and built a probabilistic Gradient Boosting classifier with a minimal set of physicochemical descriptors (log P, pKa, molecular size and topology parameters). This approach favored interpretability over statistical performance but still achieved an overall classification accuracy of 75%. In the current follow-up work we expanded the database (provided in Supplementary Information) to almost 9400 molecules and performed temporal validation of the model on a set of novel chemicals from recently published lead optimization projects. Validation results showed almost no performance degradation compared to the original study. Additionally, we rebuilt the model using the AFT (Accelerated Failure Time) learning objective in XGBoost, which accepts both the quantitative and the censored data often reported in protein inhibition studies. The new model achieved a similar level of accuracy in discerning hERG blockers from non-blockers at a 10 µM threshold, which can be regarded as close to the performance ceiling for methods that aim to describe only non-specific ligand interactions with hERG. Yet this model outputs quantitative potency values (IC50) and is not tied to a particular classification cut-off. pIC50 from patch-clamp measurements can be predicted with R² ≈ 0.4 and MAE < 0.5, which enables ranking of ligands according to their expected potency levels. The employed approach can be valuable for quantitative modeling of various ADME and drug safety endpoints with a high prevalence of censored data.
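
To make the notion of censored potency data concrete, the sketch below shows one plausible way to turn reported IC50 values, including qualified ones such as "> 30 µM", into lower and upper bounds on pIC50, which is the kind of interval label a censored-regression objective consumes. It is an illustration under stated assumptions, not the authors' pipeline: the function names, the qualifier encoding ("=", ">", "<"), and the inclusive treatment of the 10 µM boundary are all hypothetical choices made here.

```python
import math
from typing import Tuple


def pic50_from_um(ic50_um: float) -> float:
    """Convert IC50 in micromolar to pIC50 = -log10(IC50 in mol/L)."""
    if ic50_um <= 0:
        raise ValueError("IC50 must be positive")
    return 6.0 - math.log10(ic50_um)


def pic50_bounds(qualifier: str, ic50_um: float) -> Tuple[float, float]:
    """Return (lower, upper) bounds on pIC50 for one reported measurement.

    Assumed encoding (hypothetical, not taken from the paper):
      "=" exact value      -> lower == upper
      ">" IC50 above value -> weaker binder, pIC50 bounded from above
      "<" IC50 below value -> stronger binder, pIC50 bounded from below
    """
    p = pic50_from_um(ic50_um)
    if qualifier == "=":
        return (p, p)
    if qualifier == ">":
        return (-math.inf, p)
    if qualifier == "<":
        return (p, math.inf)
    raise ValueError(f"unknown qualifier: {qualifier!r}")


def is_blocker(pic50: float, threshold_um: float = 10.0) -> bool:
    """Classify a predicted pIC50 against an IC50 cut-off.

    Treating the boundary itself as a blocker is an assumption of this sketch.
    """
    return pic50 >= pic50_from_um(threshold_um)


if __name__ == "__main__":
    records = [("=", 1.0), (">", 30.0), ("<", 0.1)]
    for qualifier, ic50 in records:
        print(qualifier, ic50, pic50_bounds(qualifier, ic50))
```

Because a quantitative prediction is compared with the cut-off only at the end, the same predicted pIC50 can be re-classified at any other threshold by changing `threshold_um`, which reflects the abstract's point that the model is not tied to one classification cut-off.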

Source journal: Journal of Computer-Aided Molecular Design (Biology; Computer Science, Interdisciplinary Applications)
CiteScore: 8.00
Self-citation rate: 8.60%
Articles published: 56
Review time: 3 months
About the journal: The Journal of Computer-Aided Molecular Design provides a forum for disseminating information on both the theory and the application of computer-based methods in the analysis and design of molecules. The scope of the journal encompasses papers which report new and original research and applications in the following areas: theoretical chemistry; computational chemistry; computer and molecular graphics; molecular modeling; protein engineering; drug design; expert systems; general structure-property relationships; molecular dynamics; chemical database development and usage.