Nature Machine Intelligence最新文献_第9页

Compositional pretraining improves computational efficiency and matches animal behaviour on complex tasks 组合预训练提高了计算效率，并在复杂任务中匹配动物行为

IF 23.9 1区计算机科学

Nature Machine Intelligence Pub Date : 2025-05-19 DOI: 10.1038/s42256-025-01029-3

David Hocker, Christine M. Constantinople, Cristina Savin

{"title":"Compositional pretraining improves computational efficiency and matches animal behaviour on complex tasks","authors":"David Hocker, Christine M. Constantinople, Cristina Savin","doi":"10.1038/s42256-025-01029-3","DOIUrl":"10.1038/s42256-025-01029-3","url":null,"abstract":"Recurrent neural networks (RNNs) are ubiquitously used in neuroscience to capture both neural dynamics and behaviours of living systems. However, when it comes to complex cognitive tasks, training RNNs with traditional methods can prove difficult and fall short of capturing crucial aspects of animal behaviour. Here we propose a principled approach for identifying and incorporating compositional tasks as part of RNN training. Taking as the target a temporal wagering task previously studied in rats, we design a pretraining curriculum of simpler cognitive tasks that reflect relevant subcomputations, which we term ‘kindergarten curriculum learning’. We show that this pretraining substantially improves learning efficacy and is critical for RNNs to adopt similar strategies as rats, including long-timescale inference of latent states, which conventional pretraining approaches fail to capture. Mechanistically, our pretraining supports the development of slow dynamical systems features needed for implementing both inference and value-based decision making. Overall, our approach helps endow RNNs with relevant inductive biases, which is important when modelling complex behaviours that rely on multiple cognitive functions. Hocker et al. demonstrate a method for training recurrent neural networks, which they call ‘kindergarten curriculum learning’, involving pretraining on simple cognitive tasks to improve learning efficiency. This approach helps recurrent neural networks to mimic animal behaviour in solving complex tasks.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 5","pages":"689-702"},"PeriodicalIF":23.9,"publicationDate":"2025-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144088332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Back to recurrent processing at the crossroad of transformers and state-space models 回到变压器和状态空间模型交叉路口的循环处理

IF 23.9 1区计算机科学

Nature Machine Intelligence Pub Date : 2025-05-15 DOI: 10.1038/s42256-025-01034-6

Matteo Tiezzi, Michele Casoni, Alessandro Betti, Tommaso Guidi, Marco Gori, Stefano Melacci

{"title":"Back to recurrent processing at the crossroad of transformers and state-space models","authors":"Matteo Tiezzi, Michele Casoni, Alessandro Betti, Tommaso Guidi, Marco Gori, Stefano Melacci","doi":"10.1038/s42256-025-01034-6","DOIUrl":"10.1038/s42256-025-01034-6","url":null,"abstract":"It is a longstanding challenge for the machine learning community to develop models that are capable of processing and learning from long sequences of data. The exceptional results of transformer-based approaches, such as large language models, promote the idea of parallel attention as the key to succeed in such a challenge, temporarily obscuring the role of classic sequential processing of recurrent models. However, in the past few years, a new generation of neural models has emerged, combining transformers and recurrent networks motivated by concerns over the quadratic complexity of self-attention. Meanwhile, (deep) state-space models have also emerged as robust approaches to function approximation over time, thus opening a new perspective in learning from sequential data. Here we provide an overview of these trends unified under the umbrella of recurrent models, and discuss their likely crucial impact in the development of future architectures for large generative models. While transformers and large language models excel at efficiently processing long sequences, new approaches have been proposed that incorporate recurrence to overcome the quadratic cost of self-attention. Tiezzi et al. discuss recurrent and state-space models and the promise they hold for future sequence processing networks.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 5","pages":"678-688"},"PeriodicalIF":23.9,"publicationDate":"2025-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143979612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Bridging chemistry and artificial intelligence by a reaction description language 用反应描述语言架起化学与人工智能的桥梁

IF 23.9 1区计算机科学

Nature Machine Intelligence Pub Date : 2025-05-13 DOI: 10.1038/s42256-025-01032-8

Jiacheng Xiong, Wei Zhang, Yinquan Wang, Jiatao Huang, Yuqi Shi, Mingyan Xu, Manjia Li, Zunyun Fu, Xiangtai Kong, Yitian Wang, Zhaoping Xiong, Mingyue Zheng

{"title":"Bridging chemistry and artificial intelligence by a reaction description language","authors":"Jiacheng Xiong, Wei Zhang, Yinquan Wang, Jiatao Huang, Yuqi Shi, Mingyan Xu, Manjia Li, Zunyun Fu, Xiangtai Kong, Yitian Wang, Zhaoping Xiong, Mingyue Zheng","doi":"10.1038/s42256-025-01032-8","DOIUrl":"10.1038/s42256-025-01032-8","url":null,"abstract":"With the fast-paced development of artificial intelligence, large language models are increasingly used to tackle various scientific challenges. A critical step in this process is converting domain-specific data into a sequence of tokens for language modelling. In chemistry, molecules are often represented by molecular linear notations, and chemical reactions are depicted as sequence pairs of reactants and products. However, this approach does not capture atomic and bond changes during reactions. Here, we present ReactSeq, a reaction description language that defines molecular editing operations for step-by-step chemical transformation. Based on ReactSeq, language models for retrosynthesis prediction may consistently excel in all benchmark tests, and demonstrate promising emergent abilities in the human-in-the-loop and explainable artificial intelligence. Moreover, ReactSeq has allowed us to obtain universal and reliable representations of chemical reactions, which enable navigation of the reaction space and aid in the recommendation of experimental procedures and prediction of reaction yields. We foresee that ReactSeq can serve as a bridge to narrow the gap between chemistry and artificial intelligence. Xiong et al. introduce ReactSeq, a reaction description language that captures molecular editing operations in chemical reactions. It enables language models to excel in retrosynthesis prediction and reaction representation.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 5","pages":"782-793"},"PeriodicalIF":23.9,"publicationDate":"2025-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143940633","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Generating 3D small binding molecules using shape-conditioned diffusion models with guidance 生成三维小结合分子使用形状条件扩散模型与指导

IF 23.9 1区计算机科学

Nature Machine Intelligence Pub Date : 2025-05-12 DOI: 10.1038/s42256-025-01030-w

Ziqi Chen, Bo Peng, Tianhua Zhai, Daniel Adu-Ampratwum, Xia Ning

{"title":"Generating 3D small binding molecules using shape-conditioned diffusion models with guidance","authors":"Ziqi Chen, Bo Peng, Tianhua Zhai, Daniel Adu-Ampratwum, Xia Ning","doi":"10.1038/s42256-025-01030-w","DOIUrl":"10.1038/s42256-025-01030-w","url":null,"abstract":"Drug development is a critical but notoriously resource- and time-consuming process. Traditional methods, such as high-throughput screening, rely on opportunistic trial and error and cannot ensure optimal precision design. To overcome these challenges, generative artificial intelligence methods have emerged to directly design molecules with desired properties. Here we develop a generative artificial intelligence method DiffSMol for drug discovery that generates 3D small binding molecules based on known ligand shapes. DiffSMol encapsulates ligand shape details within pretrained, expressive shape embeddings and generates binding molecules through a diffusion model. DiffSMol further modifies the generated 3D structures iteratively using shape guidance to better resemble ligand shapes, and protein pocket guidance to optimize binding affinities. We show that DiffSMol outperforms state-of-the-art methods on benchmark datasets. When generating binding molecules resembling ligand shapes, DiffSMol with shape guidance achieves a success rate 61.4%, substantially outperforming the best baseline (11.2%), meanwhile producing molecules with de novo graph structures. DiffSMol with pocket guidance also outperforms the best baseline in binding affinities by 13.2%, and even by 17.7% when combined with shape guidance. Case studies for two critical drug targets demonstrate very favourable physicochemical and pharmacokinetic properties of generated molecules, highlighting the potential of DiffSMol in developing promising drug candidates. Chen et al. introduce an AI-driven method that generates 3D small-molecule drug candidates and outperforms existing approaches by leveraging ligand shape information.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 5","pages":"758-770"},"PeriodicalIF":23.9,"publicationDate":"2025-05-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143933591","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Lossless data compression by large models 大型模型的无损数据压缩

IF 23.9 1区计算机科学

Nature Machine Intelligence Pub Date : 2025-05-01 DOI: 10.1038/s42256-025-01033-7

Ziguang Li, Chao Huang, Xuliang Wang, Haibo Hu, Cole Wyeth, Dongbo Bu, Quan Yu, Wen Gao, Xingwu Liu, Ming Li

{"title":"Lossless data compression by large models","authors":"Ziguang Li, Chao Huang, Xuliang Wang, Haibo Hu, Cole Wyeth, Dongbo Bu, Quan Yu, Wen Gao, Xingwu Liu, Ming Li","doi":"10.1038/s42256-025-01033-7","DOIUrl":"10.1038/s42256-025-01033-7","url":null,"abstract":"Data compression is a fundamental technology that enables efficient storage and transmission of information. However, traditional compression methods are approaching their theoretical limits after 80 years of research and development. At the same time, large artificial intelligence models have emerged, which, trained on vast amounts of data, are able to ‘understand’ various semantics. Intuitively, semantics conveys the meaning of data concisely, so large models hold the potential to revolutionize compression technology. Here we present LMCompress, a new method that leverages large models to compress data. LMCompress shatters all previous lossless compression records on four media types: text, images, video and audio. It halves the compression rates of JPEG-XL for images, FLAC for audio and H.264 for video, and it achieves nearly one-third of the compression rates of zpaq for text. Our results demonstrate that the better a model understands the data, the more effectively it can compress it, suggesting a deep connection between understanding and compression. Effective lossless compression requires that frequent patterns in the data can be identified. Li et al. explore using deep learning models to more effectively compress text, audio and video data.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 5","pages":"794-799"},"PeriodicalIF":23.9,"publicationDate":"2025-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143893302","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Machine learning prediction of enzyme optimum pH 酶最适pH的机器学习预测

IF 23.9 1区计算机科学

Nature Machine Intelligence Pub Date : 2025-04-29 DOI: 10.1038/s42256-025-01026-6

Japheth E. Gado, Matthew Knotts, Ada Y. Shaw, Debora Marks, Nicholas P. Gauthier, Chris Sander, Gregg T. Beckham

{"title":"Machine learning prediction of enzyme optimum pH","authors":"Japheth E. Gado, Matthew Knotts, Ada Y. Shaw, Debora Marks, Nicholas P. Gauthier, Chris Sander, Gregg T. Beckham","doi":"10.1038/s42256-025-01026-6","DOIUrl":"10.1038/s42256-025-01026-6","url":null,"abstract":"The relationship between pH and enzyme catalytic activity, especially the optimal pH (pHopt) at which enzymes function, is critical for biotechnological applications. Hence, computational methods to predict pHopt will enhance enzyme discovery and design by facilitating accurate identification of enzymes that function optimally at specific pH levels, and by elucidating sequence–function relationships. Here we proposed and evaluated various machine learning methods for predicting pHopt, conducting extensive hyperparameter optimization and training over 11,000 model instances. Our results demonstrate that models utilizing language model embeddings markedly outperform other methods in predicting pHopt. We present EpHod, the best-performing model, to predict pHopt, making it publicly available to researchers. From sequence data, EpHod directly learns structural and biophysical features that relate to pHopt, including proximity of residues to the catalytic centre and the accessibility of solvent molecules. Overall, EpHod presents a promising advancement in pHopt prediction and will potentially speed up the development of enzyme technologies. Accurately predicting the optimal pH level for enzyme activity is challenging due to the complex relationship between enzyme structure and function. Gado and colleagues show that a language model can effectively learn the structural and biophysical features to predict the optimal pH for enzyme activity.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 5","pages":"716-729"},"PeriodicalIF":23.9,"publicationDate":"2025-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143884858","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Robot planning with LLMs 机器人规划与法学硕士

IF 23.9 1区计算机科学

Nature Machine Intelligence Pub Date : 2025-04-23 DOI: 10.1038/s42256-025-01036-4

引用次数: 0

Optimal transport for generating transition states in chemical reactions 化学反应中产生过渡态的最佳输运

IF 23.9 1区计算机科学

Nature Machine Intelligence Pub Date : 2025-04-23 DOI: 10.1038/s42256-025-01010-0

Chenru Duan, Guan-Horng Liu, Yuanqi Du, Tianrong Chen, Qiyuan Zhao, Haojun Jia, Carla P. Gomes, Evangelos A. Theodorou, Heather J. Kulik

{"title":"Optimal transport for generating transition states in chemical reactions","authors":"Chenru Duan, Guan-Horng Liu, Yuanqi Du, Tianrong Chen, Qiyuan Zhao, Haojun Jia, Carla P. Gomes, Evangelos A. Theodorou, Heather J. Kulik","doi":"10.1038/s42256-025-01010-0","DOIUrl":"10.1038/s42256-025-01010-0","url":null,"abstract":"Transition states (TSs) are transient structures that are key to understanding reaction mechanisms and designing catalysts but challenging to capture in experiments. Many optimization algorithms have been developed to search for TSs computationally. Yet, the cost of these algorithms driven by quantum chemistry methods (usually density functional theory) is still high, posing challenges for their applications in building large reaction networks for reaction exploration. Here we developed React-OT, an optimal transport approach for generating unique TS structures from reactants and products. React-OT generates highly accurate TS structures with a median structural root mean square deviation of 0.053 Å and median barrier height error of 1.06 kcal mol−1 requiring only 0.4 s per reaction. The root mean square deviation and barrier height error are further improved by roughly 25% through pretraining React-OT on a large reaction dataset obtained with a lower level of theory, GFN2-xTB. We envision that the remarkable accuracy and rapid inference of React-OT will be highly useful when integrated with the current high-throughput TS search workflow. This integration will facilitate the exploration of chemical reactions with unknown mechanisms. Duan et al. introduce an optimal transport approach to generate transition states, surpassing diffusion models in precision and speed. This method can facilitate the study of chemical reactions with unknown mechanisms.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 4","pages":"615-626"},"PeriodicalIF":23.9,"publicationDate":"2025-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.nature.comhttps://www.nature.com/articles/s42256-025-01010-0.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143867037","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Personalized uncertainty quantification in artificial intelligence 人工智能中的个性化不确定性量化

IF 23.9 1区计算机科学

Nature Machine Intelligence Pub Date : 2025-04-23 DOI: 10.1038/s42256-025-01024-8

Tapabrata Chakraborti, Christopher R. S. Banerji, Ariane Marandon, Vicky Hellon, Robin Mitra, Brieuc Lehmann, Leandra Bräuninger, Sarah McGough, Cagatay Turkay, Alejandro F. Frangi, Ginestra Bianconi, Weizi Li, Owen Rackham, Deepak Parashar, Chris Harbron, Ben MacArthur

{"title":"Personalized uncertainty quantification in artificial intelligence","authors":"Tapabrata Chakraborti, Christopher R. S. Banerji, Ariane Marandon, Vicky Hellon, Robin Mitra, Brieuc Lehmann, Leandra Bräuninger, Sarah McGough, Cagatay Turkay, Alejandro F. Frangi, Ginestra Bianconi, Weizi Li, Owen Rackham, Deepak Parashar, Chris Harbron, Ben MacArthur","doi":"10.1038/s42256-025-01024-8","DOIUrl":"10.1038/s42256-025-01024-8","url":null,"abstract":"Artificial intelligence (AI) tools are increasingly being used to help make consequential decisions about individuals. While AI models may be accurate on average, they can simultaneously be highly uncertain about outcomes associated with specific individuals or groups of individuals. For high-stakes applications (such as healthcare and medicine, defence and security, banking and finance), AI decision-support systems must be able to make personalized assessments of uncertainty in a rigorous manner. However, the statistical frameworks needed to do so are currently incomplete. Here, we outline current approaches to personalized uncertainty quantification (PUQ) and define a set of grand challenges associated with the development and use of PUQ in a range of areas, including multimodal AI, explainable AI, generative AI and AI fairness. AI tools are increasingly used for important decisions, but they can be uncertain about specific individuals or groups. Chakraborty et al. discuss the need for better methods to assess uncertainty in high-stakes applications such as healthcare and finance, and outline a set of main challenges to provide practical guidance for AI researchers.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 4","pages":"522-530"},"PeriodicalIF":23.9,"publicationDate":"2025-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143867038","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Sparse and transferable three-dimensional dynamic vascular reconstruction for instantaneous diagnosis 用于瞬时诊断的稀疏和可转移的三维动态血管重建

IF 23.9 1区计算机科学

Nature Machine Intelligence Pub Date : 2025-04-21 DOI: 10.1038/s42256-025-01025-7

Yinheng Zhu, Yong Wang, Chunxia Di, Hanghang Liu, Fangzhou Liao, Shaohua Ma

{"title":"Sparse and transferable three-dimensional dynamic vascular reconstruction for instantaneous diagnosis","authors":"Yinheng Zhu, Yong Wang, Chunxia Di, Hanghang Liu, Fangzhou Liao, Shaohua Ma","doi":"10.1038/s42256-025-01025-7","DOIUrl":"10.1038/s42256-025-01025-7","url":null,"abstract":"Three-dimensional (3D) structural information of cardiac vessels is crucial for the diagnosis and treatment of cardiovascular disease. In clinical practice, interventionalists have to empirically infer 3D cardiovascular topology from multi-view X-ray angiography images, which is time-consuming and requires extensive experience. Owing to the dynamic nature of heartbeats and sparse-view observations in clinical practice, accurate and efficient reconstruction of 3D cardiovascular structures from X-ray angiography images remains challenging. Here we introduce AutoCAR, a fully automated transfer learning-based algorithm for dynamic 3D cardiovascular reconstruction. AutoCAR comprises three main components: pose domain adaptation, sparse backwards projection and vascular graph optimization. By merging the X-ray angiography imaging parameter statistics of over 1,000 clinical cases into synthetic data generation, and exploiting the intrinsic spatial sparsity of cardiac vessels for computational design, AutoCAR outperforms state-of-the-art methods in both qualitative and quantitative evaluations, enabling dynamic cardiovascular reconstruction in real-world clinical settings. We envision that AutoCAR will facilitate current diagnostic and intervention procedures and pave the way for real-time visual guidance and autonomous catheter navigation in cardiac intervention. Yinheng Zhu et al. present AutoCAR, an automated algorithm for reconstructing three-dimensional cardiovascular structures from X-ray images. It uses transfer learning and vascular graph optimization to achieve high efficiency and accuracy, with the goal to enable medical procedures and diagnosis in real-world settings.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 5","pages":"730-742"},"PeriodicalIF":23.9,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143853418","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0