LIDEB's Useful Decoys (LUDe): A freely available decoy-generation tool. Benchmarking and scope

Artificial intelligence in the life sciences | Pub Date: 2025-02-07 | DOI: 10.1016/j.ailsci.2025.100129

Lucas N. Alberca, Denis N. Prada Gori, Maximiliano J. Fallico, Alexandre V. Fassio, Alan Talevi, Carolina L. Bellera

Volume 7, Article 100129. Publisher page: https://www.sciencedirect.com/science/article/pii/S2667318525000054

Citations: 0

Abstract

In chemoinformatics, and particularly when developing models for virtual screening campaigns, it is essential to run retrospective virtual screening experiments that evaluate the performance of such models in a scenario similar to the real one, that is, their ability to recover a small number of active compounds dispersed among a much larger number of compounds without the desired activity. However, such retrospective experiments are often limited by the relative scarcity of known inactive compounds against the pharmacological target of interest. In these cases, automatic decoy (putative inactive compound) generation tools are often of great importance. Their basic goal is to generate decoys that are similar enough to the known active compounds to challenge the models, but different enough that the probability of the decoys modulating the molecular target of interest is small.
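A retrospective screening experiment of the kind described above is commonly summarized with an enrichment factor: the hit rate among the top-ranked compounds divided by the hit rate in the whole library. The following is a minimal sketch of that calculation, not a procedure taken from the article; the function name, the `fraction` parameter, and the toy library are illustrative assumptions.

```python
def enrichment_factor(scores, labels, fraction=0.01):
    """Enrichment factor for the top `fraction` of a ranked library.

    scores: model scores, higher meaning "more likely active" (assumption).
    labels: 1 for a known active, 0 for a decoy or inactive compound.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    n_total = len(scores)
    n_actives = sum(labels)
    if n_actives == 0:
        raise ValueError("at least one active compound is required")

    # Number of compounds inspected at the chosen cutoff (at least one).
    n_top = max(1, round(n_total * fraction))

    # Rank compounds from best to worst score and count actives at the top.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = sum(label for _, label in ranked[:n_top])

    # Hit rate at the top of the list relative to the library-wide hit rate.
    return (hits / n_top) / (n_actives / n_total)


if __name__ == "__main__":
    # Toy library: 2 actives dispersed among 98 decoys (illustrative data only).
    labels = [1, 1] + [0] * 98
    scores = [0.99, 0.95] + [0.5 - 0.001 * i for i in range(98)]
    print(enrichment_factor(scores, labels, fraction=0.02))
```

In this toy case both actives rank first, so the top 2% of the list contains only actives and the enrichment is the maximum possible for a library with 2% actives. An enrichment factor near 1 would indicate a ranking no better than random.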

In this article, we report the latest version of our open-source decoy generation tool LUDe, inspired by the well-known DUD-E but designed to reduce the probability of generating decoys topologically similar to known active compounds. We benchmarked LUDe against DUD-E across 102 pharmacological targets, using the DOE score and the Doppelganger score as comparison criteria. LUDe decoys obtained better DOE scores for most of the targets, indicating a lower risk of artificial enrichment. The mean Doppelganger score, in contrast, was similar for LUDe and DUD-E decoys, with a slight improvement for LUDe decoys for most of the targets. Simulation experiments were performed to test whether the generated decoys are suitable for validating ligand-based models. Our results suggest that LUDe decoys are apt to be used to validate and compare machine learning ligand-based screening approaches. Importantly, LUDe may be used locally, independently of external server availability, and is thus suitable for obtaining decoys from large datasets. It is available as a Web App (at https://lideb.biol.unlp.edu.ar/?page_id=1076) and as Python code (at https://github.com/LIDeB/LUDe.v1.0).
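The general idea behind decoy selection (decoys matched to an active on physicochemical properties while remaining topologically dissimilar to it) can be sketched as below. This is a hypothetical illustration, not LUDe's actual algorithm: the dictionary layout, the property windows, and the 0.5 similarity cutoff are arbitrary assumptions, and fingerprints are represented as plain sets of "on" bit indices rather than being computed by a cheminformatics library.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    union = len(bits_a | bits_b)
    return len(bits_a & bits_b) / union if union else 0.0


def select_decoys(active, candidates, mw_window=25.0, logp_window=1.0,
                  max_similarity=0.5):
    """Keep candidates that are property-matched to the active but dissimilar.

    All thresholds are illustrative assumptions, not values from the article.
    """
    decoys = []
    for cand in candidates:
        # Similar enough in bulk properties to challenge a model.
        property_matched = (
            abs(cand["mw"] - active["mw"]) <= mw_window
            and abs(cand["logp"] - active["logp"]) <= logp_window
        )
        # Different enough in topology to be unlikely to hit the target.
        dissimilar = tanimoto(cand["bits"], active["bits"]) <= max_similarity
        if property_matched and dissimilar:
            decoys.append(cand)
    return decoys


if __name__ == "__main__":
    # Made-up molecules: molecular weight, logP, and fingerprint on-bits.
    active = {"name": "active", "mw": 300.0, "logp": 2.0, "bits": {1, 2, 3, 4}}
    candidates = [
        {"name": "A", "mw": 310.0, "logp": 2.5, "bits": {1, 9, 10, 11}},
        {"name": "B", "mw": 305.0, "logp": 2.1, "bits": {1, 2, 3, 5}},
        {"name": "C", "mw": 450.0, "logp": 2.0, "bits": {20, 21}},
    ]
    print([d["name"] for d in select_decoys(active, candidates)])
```

Here candidate B is rejected for being too similar to the active and candidate C for falling outside the molecular-weight window, leaving only A. A real pipeline would compute the properties and fingerprints from chemical structures and would typically match on more descriptors.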


Source journal

Artificial intelligence in the life sciences: Pharmacology, Biochemistry, Genetics and Molecular Biology (General), Computer Science Applications, Health Informatics, Drug Discovery, Veterinary Science and Veterinary Medicine (General)

CiteScore

5.00

Self-citation rate

0.00%

Review time

15 days