{"title":"Based on the science, diversity matters","authors":"","doi":"10.1038/s43588-025-00778-w","DOIUrl":"10.1038/s43588-025-00778-w","url":null,"abstract":"We reflect on what science tells us about the importance of diversity.","PeriodicalId":74246,"journal":{"name":"Nature computational science","volume":"5 2","pages":"91-91"},"PeriodicalIF":12.0,"publicationDate":"2025-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.nature.com/articles/s43588-025-00778-w.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143461002","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhancing synthesis prediction via machine learning","authors":"J. C. Schön","doi":"10.1038/s43588-025-00771-3","DOIUrl":"10.1038/s43588-025-00771-3","url":null,"abstract":"Identifying promising synthesis targets and designing routes to their synthesis is a grand challenge in chemistry and materials science. Recent work employing machine learning in combination with traditional approaches is opening new ways to address this truly Herculean task.","PeriodicalId":74246,"journal":{"name":"Nature computational science","volume":"5 2","pages":"95-96"},"PeriodicalIF":12.0,"publicationDate":"2025-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143400963","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Harnessing large language models for data-scarce learning of polymer properties.","authors":"Ning Liu, Siavash Jafarzadeh, Brian Y Lattimer, Shuna Ni, Jim Lua, Yue Yu","doi":"10.1038/s43588-025-00768-y","DOIUrl":"https://doi.org/10.1038/s43588-025-00768-y","url":null,"abstract":"Large language models (LLMs) bear promise as a fast and accurate material modeling paradigm for evaluation, analysis and design. Their vast number of trainable parameters necessitates a wealth of data to achieve accuracy and mitigate overfitting. However, experimental measurements are often limited and costly to obtain in sufficient quantities for fine-tuning. To this end, here we present a physics-based training pipeline that tackles the pathology of data scarcity. The core enabler is a physics-based modeling framework that generates a multitude of synthetic data to align the LLM to a physically consistent initial state before fine-tuning. Our framework features a two-phase training strategy: utilizing the large-in-amount but less accurate synthetic data for supervised pretraining, and fine-tuning the phase-1 model with limited experimental data. We empirically demonstrate that supervised pretraining is vital to obtaining accurate fine-tuned LLMs, via the lens of learning polymer flammability metrics where cone calorimeter data are sparse.","PeriodicalId":74246,"journal":{"name":"Nature computational science","volume":" ","pages":""},"PeriodicalIF":12.0,"publicationDate":"2025-02-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143392635","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A statistical framework for multi-trait rare variant analysis in large-scale whole-genome sequencing studies","authors":"Xihao Li, Han Chen, Margaret Sunitha Selvaraj, Eric Van Buren, Hufeng Zhou, Yuxuan Wang, Ryan Sun, Zachary R. McCaw, Zhi Yu, Min-Zhi Jiang, Daniel DiCorpo, Sheila M. Gaynor, Rounak Dey, Donna K. Arnett, Emelia J. Benjamin, Joshua C. Bis, John Blangero, Eric Boerwinkle, Donald W. Bowden, Jennifer A. Brody, Brian E. Cade, April P. Carson, Jenna C. Carlson, Nathalie Chami, Yii-Der Ida Chen, Joanne E. Curran, Paul S. de Vries, Myriam Fornage, Nora Franceschini, Barry I. Freedman, Charles Gu, Nancy L. Heard-Costa, Jiang He, Lifang Hou, Yi-Jen Hung, Marguerite R. Irvin, Robert C. Kaplan, Sharon L. R. Kardia, Tanika N. Kelly, Iain Konigsberg, Charles Kooperberg, Brian G. Kral, Changwei Li, Yun Li, Honghuang Lin, Ching-Ti Liu, Ruth J. F. Loos, Michael C. Mahaney, Lisa W. Martin, Rasika A. Mathias, Braxton D. Mitchell, May E. Montasser, Alanna C. Morrison, Take Naseri, Kari E. North, Nicholette D. Palmer, Patricia A. Peyser, Bruce M. Psaty, Susan Redline, Alexander P. Reiner, Stephen S. Rich, Colleen M. Sitlani, Jennifer A. Smith, Kent D. Taylor, Hemant K. Tiwari, Ramachandran S. Vasan, Satupa’itea Viali, Zhe Wang, Jennifer Wessel, Lisa R. Yanek, Bing Yu, NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium, Josée Dupuis, James B. Meigs, Paul L. Auer, Laura M. Raffield, Alisa K. Manning, Kenneth M. Rice, Jerome I. Rotter, Gina M. Peloso, Pradeep Natarajan, Zilin Li, Zhonghua Liu, Xihong Lin","doi":"10.1038/s43588-024-00764-8","DOIUrl":"10.1038/s43588-024-00764-8","url":null,"abstract":"Large-scale whole-genome sequencing (WGS) studies have improved our understanding of the contributions of coding and noncoding rare variants to complex human traits. Leveraging association effect sizes across multiple traits in WGS rare variant association analysis can improve statistical power over single-trait analysis, and also detect pleiotropic genes and regions. Existing multi-trait methods have limited ability to perform rare variant analysis of large-scale WGS data. We propose MultiSTAAR, a statistical framework and computationally scalable analytical pipeline for functionally informed multi-trait rare variant analysis in large-scale WGS studies. MultiSTAAR accounts for relatedness, population structure and correlation among phenotypes by jointly analyzing multiple traits, and further empowers rare variant association analysis by incorporating multiple functional annotations. We applied MultiSTAAR to jointly analyze three lipid traits in 61,838 multi-ethnic samples from the Trans-Omics for Precision Medicine (TOPMed) Program. We discovered and replicated new associations with lipid traits missed by single-trait analysis. MultiSTAAR provides a general and flexible statistical framework for functionally informed multi-trait rare variant analysis of biobank-scale sequencing studies by jointly analyzing multiple traits and incorporating annotation information.","PeriodicalId":74246,"journal":{"name":"Nature computational science","volume":"5 2","pages":"125-143"},"PeriodicalIF":12.0,"publicationDate":"2025-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143371272","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MultiSTAAR delivers multi-trait rare variant analysis of biobank-scale sequencing data","authors":"","doi":"10.1038/s43588-025-00766-0","DOIUrl":"10.1038/s43588-025-00766-0","url":null,"abstract":"Identifying pleiotropic associations for rare variants in multi-ethnic biobank-scale whole-genome sequencing data poses considerable challenges. This study introduced MultiSTAAR as a scalable and robust multi-trait rare variant analysis framework designed for both coding and noncoding regions by integrating multiple variant functional annotations and leveraging multivariate modeling across diverse phenotypes.","PeriodicalId":74246,"journal":{"name":"Nature computational science","volume":"5 2","pages":"101-102"},"PeriodicalIF":12.0,"publicationDate":"2025-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143371350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Balancing autonomy and expertise in autonomous synthesis laboratories","authors":"Xiaozhao Liu, Bin Ouyang, Yan Zeng","doi":"10.1038/s43588-025-00769-x","DOIUrl":"10.1038/s43588-025-00769-x","url":null,"abstract":"Autonomous synthesis laboratories promise to streamline the plan–make–measure–analyze iteration loop. Here, we comment on the barriers in the field, the promise of a human on-the-loop approach, and strategies for optimizing accessibility, accuracy, and efficiency of autonomous laboratories.","PeriodicalId":74246,"journal":{"name":"Nature computational science","volume":"5 2","pages":"92-94"},"PeriodicalIF":12.0,"publicationDate":"2025-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143257622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Shedding light on spatial signal transduction in cells using computational simulations","authors":"","doi":"10.1038/s43588-025-00772-2","DOIUrl":"10.1038/s43588-025-00772-2","url":null,"abstract":"We present Spatial Modeling Algorithms for Reactions and Transport (SMART), a software package that simulates spatiotemporally detailed biochemical reaction networks within realistic cellular and subcellular geometries. This paper highlights the use of SMART in several biological test cases including cellular mechanotransduction, calcium signaling in neurons and cardiomyocytes, and adenosine triphosphate synthesis.","PeriodicalId":74246,"journal":{"name":"Nature computational science","volume":"5 2","pages":"99-100"},"PeriodicalIF":12.0,"publicationDate":"2025-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143191505","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Author Correction: Spatial modeling algorithms for reactions and transport in biological cells","authors":"Emmet A. Francis, Justin G. Laughlin, Jørgen S. Dokken, Henrik N. T. Finsberg, Christopher T. Lee, Marie E. Rognes, Padmini Rangamani","doi":"10.1038/s43588-025-00773-1","DOIUrl":"10.1038/s43588-025-00773-1","url":null,"abstract":"","PeriodicalId":74246,"journal":{"name":"Nature computational science","volume":"5 2","pages":"185-185"},"PeriodicalIF":12.0,"publicationDate":"2025-02-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.nature.com/articles/s43588-025-00773-1.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143124138","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Biologically inspired graphs to explore massive genetic datasets","authors":"Ryan M. Layer","doi":"10.1038/s43588-024-00763-9","DOIUrl":"10.1038/s43588-024-00763-9","url":null,"abstract":"A recent study proposes a data structure that addresses crucial challenges related to storage and computation of large genome databases.","PeriodicalId":74246,"journal":{"name":"Nature computational science","volume":"5 2","pages":"97-98"},"PeriodicalIF":12.0,"publicationDate":"2025-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143076347","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multimodal learning for mapping genotype-phenotype dynamics.","authors":"Farhan Khodaee, Rohola Zandie, Elazer R Edelman","doi":"10.1038/s43588-024-00765-7","DOIUrl":"10.1038/s43588-024-00765-7","url":null,"abstract":"How complex phenotypes emerge from intricate gene expression patterns is a fundamental question in biology. Integrating high-content genotyping approaches such as single-cell RNA sequencing and advanced learning methods such as language models offers an opportunity for dissecting this complex relationship. Here we present a computational integrated genetics framework designed to analyze and interpret the high-dimensional landscape of genotypes and their associated phenotypes simultaneously. We applied this approach to develop a multimodal foundation model to explore the genotype-phenotype relationship manifold for human transcriptomics at the cellular level. Analyzing this joint manifold showed a refined resolution of cellular heterogeneity, uncovered potential cross-tissue biomarkers and provided contextualized embeddings to investigate the polyfunctionality of genes shown for the von Willebrand factor (VWF) gene in endothelial cells. Overall, this study advances our understanding of the dynamic interplay between gene expression and phenotypic manifestation and demonstrates the potential of integrated genetics in uncovering new dimensions of cellular function and complexity.","PeriodicalId":74246,"journal":{"name":"Nature computational science","volume":" ","pages":""},"PeriodicalIF":12.0,"publicationDate":"2025-01-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143061684","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}