Yin-yang in drug discovery: rethinking de novo design and development of predictive models

Ana L. Chávez‐Hernández, E. López-López, J. Medina‐Franco
Frontiers in Drug Discovery. Published 2023-06-21. DOI: 10.3389/fddsv.2023.1222655 (https://doi.org/10.3389/fddsv.2023.1222655)

Abstract

Chemical and biological data are the cornerstone of modern drug discovery programs. Finding qualitative or, better yet, quantitative relationships between chemical structure and biological activity has long been pursued in medicinal chemistry and drug discovery. With the rapid growth and deployment of predictive machine learning and deep learning methods, as well as renewed interest in the de novo design of compound libraries to enlarge the medicinally relevant chemical space, the balance between the quantity and quality of data is becoming a central point in the discussion of which types of data sets are needed. Although there is a general notion that more data is better, data quality is crucial regardless of data set size. Furthermore, the ratio of active to inactive compounds is also a major consideration. This review discusses the most common public data sets currently used as benchmarks to develop predictive and classification models used in de novo design. We point out the need to continue disclosing inactive compounds and negative data in peer-reviewed publications and public repositories, and to promote a balance between positive (Yang) and negative (Yin) bioactivity data. We emphasize the importance of reconsidering drug discovery initiatives with regard to both the utilization and classification of data.