Employing Artificial Neural Networks for Optimal Storage and Facile Sharing of Molecular Dynamics Simulation Trajectories

IF 5.3; Region 2, Chemistry; JCR Q1, CHEMISTRY, MEDICINAL
Abdul Wasim*, Lars V. Schäfer* and Jagannath Mondal*
Journal of Chemical Information and Modeling, volume 65, issue 17, pages 9022–9033
Publication date: 2025-08-18
Publication type: Journal Article
DOI: 10.1021/acs.jcim.5c01294
URL: https://pubs.acs.org/doi/10.1021/acs.jcim.5c01294
Citations: 0

Abstract


Employing Artificial Neural Networks for Optimal Storage and Facile Sharing of Molecular Dynamics Simulation Trajectories

With the remarkable strides in computing power and advances in Molecular Dynamics (MD) simulation programs, the crucial challenge of storing and sharing large biomolecular simulation data sets has emerged. By leveraging AutoEncoders, a type of artificial neural network, we developed a method to compress MD trajectories into significantly smaller latent spaces. Our method can save up to 98% in disk space compared to xtc, a highly compressed trajectory format from the widely used MD program package GROMACS, thus facilitating storage and sharing of simulation trajectories. Atom coordinates are very accurately reconstructed from compressed data. The method was tested across a diverse set of biomolecular systems, including folded proteins, intrinsically disordered proteins, phospholipid bilayers, protein–ligand complexes, large protein complexes and membrane-bound protein systems. The reconstructed trajectories demonstrated consistent accuracy in recovering key biophysically relevant properties for proteins, lipids and composite systems. The compression efficiency was particularly beneficial for larger systems. This approach enables the scientific community to efficiently store and share large-scale biomolecular simulation data, potentially enhancing collaborative research efforts. The workflow, termed “compresstraj”, is implemented in PyTorch and is publicly available at https://github.com/SerpentByte/compresstraj, offering a practical solution for handling the increasing volumes of data generated in biomolecular simulation studies.
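
The disk-space comparison in the abstract can be illustrated with a little arithmetic. The sketch below is not taken from compresstraj: it assumes, as is typical for autoencoder-based compression, that the compressed archive holds one latent vector per frame plus the trained decoder. The function names, the 4-byte value size, and every number in the example are hypothetical and chosen only to show how a saving relative to a reference xtc file would be computed.

```python
def latent_storage_bytes(n_frames, latent_dim, model_bytes, bytes_per_value=4):
    """Bytes for one latent vector per frame plus the stored decoder.

    Assumption: the archive consists of the latent vectors and the trained
    decoder weights; the real compresstraj on-disk layout may differ.
    """
    if n_frames < 0 or latent_dim < 0 or model_bytes < 0:
        raise ValueError("sizes must be non-negative")
    return n_frames * latent_dim * bytes_per_value + model_bytes


def fraction_saved(reference_bytes, compressed_bytes):
    """Fraction of disk space saved relative to a reference file (e.g. xtc)."""
    if reference_bytes <= 0:
        raise ValueError("reference_bytes must be positive")
    return 1.0 - compressed_bytes / reference_bytes


if __name__ == "__main__":
    # Hypothetical sizes, for illustration only (not results from the paper).
    xtc_bytes = 2_000_000_000
    compressed = latent_storage_bytes(
        n_frames=100_000, latent_dim=64, model_bytes=24_400_000
    )
    print(f"compressed size: {compressed} bytes")
    print(f"space saved vs. xtc: {fraction_saved(xtc_bytes, compressed):.1%}")
```

Because the decoder is stored once while the latent vectors grow with the number of frames, the fixed model cost matters less for longer trajectories under this assumed layout.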

Source journal
CiteScore: 9.80
Self-citation rate: 10.70%
Articles published: 529
Review time: 1.4 months
About the journal: The Journal of Chemical Information and Modeling publishes papers reporting new methodology and/or important applications in the fields of chemical informatics and molecular modeling. Specific topics include the representation and computer-based searching of chemical databases, molecular modeling, computer-aided molecular design of new materials, catalysts, or ligands, development of new computational methods or efficient algorithms for chemical software, and biopharmaceutical chemistry including analyses of biological activity and other issues related to drug discovery. Astute chemists, computer scientists, and information specialists look to this monthly’s insightful research studies, programming innovations, and software reviews to keep current with advances in this integral, multidisciplinary field. As a subscriber you’ll stay abreast of database search systems, use of graph theory in chemical problems, substructure search systems, pattern recognition and clustering, analysis of chemical and physical data, molecular modeling, graphics and natural language interfaces, bibliometric and citation analysis, and synthesis design and reactions databases.