Clustering one million molecular structures on GPU within seconds

IF 3.4 | Zone 3, Chemistry | JCR Q2, Chemistry, Multidisciplinary
Junyong Gao, Mincong Wu, Jun Liao, Fanjun Meng, Changjun Chen
Journal of Computational Chemistry, 45(32), pp. 2710-2718. Journal Article. Published 2024-08-14. DOI: 10.1002/jcc.27470
https://onlinelibrary.wiley.com/doi/10.1002/jcc.27470
Citations: 0

Abstract


Structure clustering is a common but time-consuming task in the life sciences. To date, most published tools do not support clustering analysis on graphics processing units (GPUs) with the root mean square deviation (RMSD) metric. In this work, we present a program written specifically for this task. It supports multiple threads on multiple GPUs. To demonstrate its performance, we apply the program to a 33-residue fragment of a Pin1 WW domain mutant. The dataset contains 1,400,000 snapshots, which were extracted from an enhanced sampling simulation and are widely distributed in conformational space. The test results show that the program is efficient. In particular, with two NVIDIA RTX4090 GPUs and the single-precision data type, the clustering calculation on 1 million snapshots completes in a few seconds (including the time to upload the data from memory to the GPU, but excluding the time to read it from hard disk). This is hundreds of times faster than on a central processing unit (CPU). The program could be a powerful tool for quickly extracting representative states of a molecule from thousands to millions of candidate structures.
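To make the idea of clustering under an RMSD metric concrete, here is a minimal CPU sketch in plain Python. It is not the authors' program: the abstract does not state which clustering algorithm they use, so the greedy leader scheme, the function names `rmsd` and `leader_cluster`, the `cutoff` parameter, and the toy coordinates are all assumptions made for illustration. The sketch also assumes the structures are already superimposed, so no fitting step is performed before the distance is computed.

```python
import math


def rmsd(a, b):
    """RMSD between two structures given as equal-length lists of (x, y, z).

    Assumption: the structures are already superimposed; no fitting is done.
    """
    if len(a) != len(b) or not a:
        raise ValueError("structures must be non-empty and of equal length")
    squared = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(squared / len(a))


def leader_cluster(structures, cutoff):
    """Greedy leader clustering (an illustrative choice, not the paper's).

    Each structure joins the first representative within `cutoff`;
    otherwise it becomes the representative of a new cluster.
    """
    reps = []    # index of the representative structure of each cluster
    labels = []  # cluster index assigned to each structure
    for i, structure in enumerate(structures):
        for cluster, rep in enumerate(reps):
            if rmsd(structure, structures[rep]) <= cutoff:
                labels.append(cluster)
                break
        else:
            reps.append(i)
            labels.append(len(reps) - 1)
    return reps, labels


# Toy data: three two-atom "snapshots" (hypothetical coordinates).
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.1, 0.0, 0.0), (1.1, 0.0, 0.0)],
    [(0.0, 3.0, 0.0), (1.0, 3.0, 0.0)],
]
reps, labels = leader_cluster(frames, cutoff=0.5)
print(reps, labels)
```

Each call to `rmsd` is independent of the others, which is the kind of work that a GPU implementation can spread across many threads; how the published program actually organizes this is not described in the abstract.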

Source journal
CiteScore: 6.60
Self-citation rate: 3.30%
Articles published: 247
Review time: 1.7 months
About the journal: This distinguished journal publishes articles concerned with all aspects of computational chemistry: analytical, biological, inorganic, organic, physical, and materials. The Journal of Computational Chemistry presents original research, contemporary developments in theory and methodology, and state-of-the-art applications. Computational areas that are featured in the journal include ab initio and semiempirical quantum mechanics, density functional theory, molecular mechanics, molecular dynamics, statistical mechanics, cheminformatics, biomolecular structure prediction, molecular design, and bioinformatics.