Nature computational science: latest articles

Enabling efficient analysis of biobank-scale data with genotype representation graphs
IF 12
Nature computational science Pub Date : 2024-12-05 DOI: 10.1038/s43588-024-00739-9
Drew DeHaas, Ziqing Pan, Xinzhu Wei
Abstract: Computational analysis of a large number of genomes requires a data structure that can represent the dataset compactly while also enabling efficient operations on variants and samples. However, encoding genetic data in existing tabular data structures and file formats has become costly and unsustainable. Here we introduce the genotype representation graph (GRG), a fully connected hierarchical data structure that losslessly encodes phased whole-genome polymorphisms. Exploiting variant-sharing across samples enables GRG to compress 200,000 UK Biobank phased human genomes to 5–26 gigabytes per chromosome, also enabling graph-traversal algorithms to reuse computed values in random access memory. Constructing and processing GRG files scales to a million whole genomes. Using allele frequencies and association effects as examples, we show that computation on GRG via graph traversal runs the fastest among all tested alternatives. GRG-based algorithms have the potential to increase the scalability and reduce the cost of analyzing large genomic datasets. The genotype representation graph (GRG) is a compact data structure that encodes 200,000 human genomes in just 5–26 gigabytes per chromosome. Computation on GRG via graph traversal greatly accelerates genome-wide analysis.
Volume 5, issue 2, pages 112-124. Publication type: Journal Article.
Citations: 0
Teaching spin symmetry while learning neural network wave functions
IF 12
Nature computational science Pub Date : 2024-12-04 DOI: 10.1038/s43588-024-00727-z
Yongle Li, Yuhao Chen, Xiao He
Abstract: By developing an efficient spin symmetry penalty, a recent study has substantially accelerated the calculation of accurate energies with correct spin states in variational Monte Carlo for both ground and excited states of quantum many-particle systems.
Volume 4, issue 12, pages 884-885. Publication type: Journal Article.
Citations: 0
Deep learning training dynamics analysis for single-cell data
IF 12
Nature computational science Pub Date : 2024-12-04 DOI: 10.1038/s43588-024-00728-y
Abstract: Inspired by recent approaches for natural language processing and computer vision, we developed Annotatability, a framework that analyzes deep neural network training dynamics to interpret pre-annotated single-cell and spatial omics data. Annotatability identified erroneous annotations and ambiguous cell states, inferred trajectories from binary labels, and revealed underlying biological signals.
Volume 4, issue 12, pages 886-887. Publication type: Journal Article.
Citations: 0
Spin-symmetry-enforced solution of the many-body Schrödinger equation with a deep neural network
IF 12
Nature computational science Pub Date : 2024-12-04 DOI: 10.1038/s43588-024-00730-4
Zhe Li, Zixiang Lu, Ruichen Li, Xuelan Wen, Xiang Li, Liwei Wang, Ji Chen, Weiluo Ren
Abstract: The integration of deep neural networks with the variational Monte Carlo (VMC) method has marked a substantial advancement in solving the Schrödinger equation. In this work we enforce spin symmetry in the neural-network-based VMC calculation using a modified optimization target. Our method is designed to solve for the ground state and multiple excited states with target spin symmetry at a low computational cost. It predicts accurate energies while maintaining the correct symmetry in strongly correlated systems, even in cases in which different spin states are nearly degenerate. Our approach also excels at spin–gap calculations, including the singlet–triplet gap in biradical systems, which is of high interest in photochemistry. Overall, this work establishes a robust framework for efficiently calculating various quantum states with specific spin symmetry in correlated systems. An efficient approach is developed to enforce spin symmetry for neural network wavefunctions when solving the many-body Schrödinger equation. This enables accurate and spin-pure simulations of both ground and excited states.
Volume 4, issue 12, pages 910-919. Publication type: Journal Article.
Citations: 0
Interpreting single-cell and spatial omics data using deep neural network training dynamics
IF 12
Nature computational science Pub Date : 2024-12-04 DOI: 10.1038/s43588-024-00721-5
Jonathan Karin, Reshef Mintz, Barak Raveh, Mor Nitzan
Abstract: Single-cell and spatial omics datasets can be organized and interpreted by annotating single cells to distinct types, states, locations or phenotypes. However, cell annotations are inherently ambiguous, as discrete labels with subjective interpretations are assigned to heterogeneous cell populations on the basis of noisy, sparse and high-dimensional data. Here we developed Annotatability, a framework for identifying annotation mismatches and characterizing biological data structure by monitoring the dynamics and difficulty of training a deep neural network over such annotated data. Following this, we developed a signal-aware graph embedding method that enables downstream analysis of biological signals. This embedding captures cellular communities associated with target signals. Using Annotatability, we address key challenges in the interpretation of genomic data, demonstrated over eight single-cell RNA sequencing and spatial omics datasets, including identifying erroneous annotations and intermediate cell states, delineating developmental or disease trajectories, and capturing cellular heterogeneity. These results underscore the broad applicability of annotation-trainability analysis via Annotatability for unraveling cellular diversity and interpreting collective cell behaviors in health and disease. The Annotatability framework analyzes neural network training dynamics to interpret single-cell and spatial omics data. It identifies erroneous annotations and ambiguous cell states, infers trajectories from binary labels and enables signal-aware analysis.
Volume 4, issue 12, pages 941-954. Publication type: Journal Article.
Open-access PDF: https://www.nature.com/articles/s43588-024-00721-5.pdf
Citations: 0
Comprehensive prediction and analysis of human protein essentiality based on a pretrained large language model.
IF 12
Nature computational science Pub Date : 2024-11-27 DOI: 10.1038/s43588-024-00733-1
Boming Kang, Rui Fan, Chunmei Cui, Qinghua Cui
Abstract: Human essential proteins (HEPs) are indispensable for individual viability and development. However, experimental methods to identify HEPs are often costly, time consuming and labor intensive. In addition, existing computational methods predict HEPs only at the cell line level, but HEPs vary across living human, cell line and animal models. Here we develop a sequence-based deep learning model, Protein Importance Calculator (PIC), by fine-tuning a pretrained protein language model. PIC not only substantially outperforms existing methods for predicting HEPs but also provides comprehensive prediction results across three levels: human, cell line and mouse. Furthermore, we define the protein essential score, derived from PIC, to quantify human protein essentiality and validate its effectiveness by a series of biological analyses. We also demonstrate the biomedical value of the protein essential score by identifying potential prognostic biomarkers for breast cancer and quantifying the essentiality of 617,462 human microproteins.
Publication type: Journal Article. DOI link: https://doi.org/10.1038/s43588-024-00733-1
Citations: 0
Harnessing the power of DNA for computing
IF 12
Nature computational science Pub Date : 2024-11-21 DOI: 10.1038/s43588-024-00742-0
Abstract: We discuss the thirty-year anniversary of the seminal work on DNA computing and its implications for the field of biotechnology.
Volume 4, issue 11, page 801. Publication type: Journal Article.
Open-access PDF: https://www.nature.com/articles/s43588-024-00742-0.pdf
Citations: 0
Collective deliberation driven by AI
IF 12
Nature computational science Pub Date : 2024-11-18 DOI: 10.1038/s43588-024-00736-y
Fernando Chirigati
Volume 4, issue 11, page 802. Publication type: Journal Article.
Citations: 0
Harnessing deep learning to build optimized ligands
IF 12
Nature computational science Pub Date : 2024-11-14 DOI: 10.1038/s43588-024-00725-1
Orestis A. Ntintas, Theodoros Daglis, Vassilis G. Gorgoulis
Abstract: A recent study proposes DeepBlock, a deep learning-based approach for generating ligands with targeted properties, such as low toxicity and high affinity with the given target. This approach outperforms existing methods in the field while maintaining synthetic accessibility and drug-likeness.
Volume 4, issue 11, pages 809-810. Publication type: Journal Article.
Citations: 0
MassiveFold: unveiling AlphaFold’s hidden potential with optimized and parallelized massive sampling
IF 12
Nature computational science Pub Date : 2024-11-11 DOI: 10.1038/s43588-024-00714-4
Nessim Raouraoua, Claudio Mirabello, Thibaut Véry, Christophe Blanchet, Björn Wallner, Marc F. Lensink, Guillaume Brysbaert
Abstract: Massive sampling in AlphaFold enables access to increased structural diversity. In combination with its efficient confidence ranking, this unlocks elevated modeling capabilities for monomeric structures and foremost for protein assemblies. However, the approach struggles with GPU cost and data storage. Here we introduce MassiveFold, an optimized and customizable version of AlphaFold that runs predictions in parallel, reducing the computing time from several months to hours. MassiveFold is scalable and able to run on anything from a single computer to a large GPU infrastructure, where it can fully benefit from all the computing nodes. Although AlphaFold is very efficient for protein structure prediction, massive sampling is a very GPU demanding task. MassiveFold overcomes this limitation, being capable of parallelizing structure prediction computation.
Volume 4, issue 11, pages 824-828. Publication type: Journal Article.
Open-access PDF: https://www.nature.com/articles/s43588-024-00714-4.pdf
Citations: 0