Graph neural networks are promising for phenotypic virtual screening on cancer cell lines.

IF 1.3, JCR Q3, Biochemical Research Methods

Biology Methods and Protocols. Pub Date: 2024-09-03. eCollection Date: 2024-01-01. DOI: 10.1093/biomethods/bpae065

Sachin Vishwakarma, Saiveth Hernandez-Hernandez, Pedro J Ballester

Citations: 0

Abstract

Artificial intelligence is increasingly driving early drug design, offering novel approaches to virtual screening (VS). Phenotypic virtual screening (PVS) aims to predict how cancer cell lines respond to different compounds by focusing on observable characteristics rather than specific molecular targets. Some studies have suggested that deep learning may not be the best approach for PVS. However, these studies are limited by the small number of molecules tested, by not employing suitable performance metrics, and by not using dissimilar-molecules splits, which better mimic the challenging chemical diversity of real-world screening libraries. Here we prepared 60 datasets, each containing approximately 30 000-50 000 molecules tested for their growth inhibitory activities on one of the NCI-60 cancer cell lines. We conducted multiple performance evaluations of each of five machine learning algorithms for PVS on these 60 problem instances. To provide an even more comprehensive evaluation, we used two model validation types: the random split and the dissimilar-molecules split. Overall, about 14 440 training runs across datasets were carried out per algorithm. The models were primarily evaluated using hit rate, a more suitable metric in VS contexts. The results show that all models are more challenged by test molecules that are substantially different from those in the training data. In both validation types, the D-MPNN algorithm, a graph-based deep neural network, was found to be the most suitable for building predictive models for this PVS problem.
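The abstract names two evaluation ingredients without defining them: hit rate and the dissimilar-molecules split. The Python sketch below illustrates one common reading of each and is not the authors' implementation. It assumes hit rate means the fraction of true actives among the `top_k` highest-scored test molecules, and it assumes each molecule already carries a cluster label (for example from a structural clustering step not shown here), so that whole clusters are held out for testing. The names `hit_rate`, `group_split`, `top_k`, `groups`, and `test_fraction` are hypothetical.

```python
import random
from typing import List, Sequence, Tuple


def hit_rate(y_true: Sequence[int], scores: Sequence[float], top_k: int) -> float:
    """Fraction of true actives (label 1) among the top_k highest-scored molecules.

    Assumption: this is one common definition of hit rate; the abstract
    does not spell out the exact formula used in the paper.
    """
    if len(y_true) != len(scores):
        raise ValueError("y_true and scores must have the same length")
    if not 0 < top_k <= len(scores):
        raise ValueError("top_k must be between 1 and the number of molecules")
    # Rank molecule indices by predicted score, best first.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    selected = ranked[:top_k]
    return sum(y_true[i] for i in selected) / top_k


def group_split(
    groups: Sequence[int], test_fraction: float, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Hold out whole groups so no group appears in both train and test.

    Assumption: `groups` holds precomputed cluster labels, one per molecule.
    Groups are added to the test set in random order until it holds at
    least test_fraction of all molecules.
    """
    unique = sorted(set(groups))
    random.Random(seed).shuffle(unique)
    target = test_fraction * len(groups)
    test_groups = set()
    n_test = 0
    for g in unique:
        if n_test >= target:
            break
        test_groups.add(g)
        n_test += sum(1 for x in groups if x == g)
    train_idx = [i for i, g in enumerate(groups) if g not in test_groups]
    test_idx = [i for i, g in enumerate(groups) if g in test_groups]
    return train_idx, test_idx
```

A random split would instead shuffle individual molecules, so close analogues of a test molecule can end up in the training set. Holding out whole clusters removes that overlap, which is consistent with the abstract's finding that all models are more challenged under the dissimilar-molecules split.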

Source journal

Biology Methods and Protocols Agricultural and Biological Sciences-Agricultural and Biological Sciences (all)

CiteScore

3.80

Self-citation rate

2.80%

Articles published

Review time

19 weeks