Traversing chemical space with active deep learning for low-data drug discovery

Impact factor: 12.0 · JCR Q1 (Computer Science, Interdisciplinary Applications)
Derek van Tilborg, Francesca Grisoni
Journal: Nature Computational Science, vol. 4, no. 10, pp. 786-796 (journal article)
Published: 2024-09-27
DOI: 10.1038/s43588-024-00697-2
URL: https://www.nature.com/articles/s43588-024-00697-2
Citations: 0

Abstract

Deep learning is accelerating drug discovery. However, current approaches are often affected by limitations in the available data, in terms of either size or molecular diversity. Active deep learning has high potential for low-data drug discovery, as it allows iterative model improvement during the screening process. However, there are several ‘known unknowns’ that limit the wider adoption of active deep learning in drug discovery: (1) what the best computational strategies are for chemical space exploration, (2) how active learning holds up to traditional, non-iterative, approaches and (3) how it should be used in the low-data scenarios typical of drug discovery. To provide answers, this study simulates a low-data drug discovery scenario, and systematically analyzes six active learning strategies combined with two deep learning architectures, on three large-scale molecular libraries. We identify the most important determinants of success in low-data regimes and show that active learning can achieve up to a sixfold improvement in hit discovery when compared with traditional screening methods. Active deep learning is a promising approach to learn from low-data scenarios in drug discovery. This study illuminates key success factors of active learning and shows that it can boost hit discovery by up to sixfold over traditional methods.
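The iterative screening loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic library, the nearest-known-hit scoring function (a toy stand-in for a deep learning model), the strategy names "greedy" and "random", and all sizes (`start_size`, `batch_size`, `cycles`) are assumptions chosen only for illustration.

```python
import random


def make_library(n=2000, dim=8, hit_fraction=0.02, seed=0):
    """Build a synthetic stand-in for a molecular library.

    Each entry is (feature_vector, is_hit). Hits cluster around one
    hidden point; everything else is spread uniformly. (Hypothetical data.)
    """
    rng = random.Random(seed)
    center = [rng.uniform(-1, 1) for _ in range(dim)]
    library = []
    for _ in range(n):
        is_hit = rng.random() < hit_fraction
        if is_hit:
            x = [c + rng.gauss(0, 0.3) for c in center]
        else:
            x = [rng.uniform(-2, 2) for _ in range(dim)]
        library.append((x, is_hit))
    return library


def score(x, known_hits):
    """Toy surrogate model: negative squared distance to the nearest known hit."""
    if not known_hits:
        return 0.0  # no information yet, so every candidate ties
    return -min(sum((a - b) ** 2 for a, b in zip(x, h)) for h in known_hits)


def run_screen(library, strategy, start_size=64, batch_size=64, cycles=5, seed=1):
    """Simulate a screening campaign and return (screened_indices, hit_count)."""
    rng = random.Random(seed)
    pool = list(range(len(library)))
    rng.shuffle(pool)
    screened = pool[:start_size]   # small random starting set
    pool = pool[start_size:]
    for _ in range(cycles):
        if strategy == "random":
            picks = pool[:batch_size]  # pool is already shuffled
        else:
            # "greedy": refit on everything screened so far, then take the
            # unscreened candidates the surrogate ranks highest.
            known_hits = [library[i][0] for i in screened if library[i][1]]
            ranked = sorted(
                pool,
                key=lambda i: score(library[i][0], known_hits),
                reverse=True,
            )
            picks = ranked[:batch_size]
        picked = set(picks)
        screened.extend(picks)     # "screen" the picks, revealing their labels
        pool = [i for i in pool if i not in picked]
    hits = sum(library[i][1] for i in screened)
    return screened, hits


if __name__ == "__main__":
    lib = make_library()
    for name in ("random", "greedy"):
        _, hits = run_screen(lib, name)
        print(name, "hits found:", hits)
```

Both strategies spend the same screening budget, so the hit counts are directly comparable. The only difference is whether each batch is chosen blindly or from a model refit on the labels revealed so far, which is the distinction between non-iterative and active screening that the abstract draws.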


Source journal metrics: CiteScore 11.70; self-citation rate 0.00%; articles published 0.