Goals as reward-producing programs
Guy Davidson, Graham Todd, Julian Togelius, Todd M. Gureckis, Brenden M. Lake
Nature Machine Intelligence 7(2), 205–220 (2025). doi:10.1038/s42256-025-00981-4

People are remarkably capable of generating their own goals, beginning with child's play and continuing into adulthood. Despite considerable empirical and computational work on goals and goal-oriented behaviour, models are still far from capturing the richness of everyday human goals. Here we bridge this gap by collecting a dataset of human-generated playful goals (in the form of scorable, single-player games), modelling them as reward-producing programs and generating novel human-like goals through program synthesis. Reward-producing programs capture the rich semantics of goals through symbolic operations that compose, add temporal constraints and allow program execution on behavioural traces to evaluate progress. To build a generative model of goals, we learn a fitness function over the infinite set of possible goal programs and sample novel goals with a quality-diversity algorithm. Human evaluators found that model-generated goals, when sampled from partitions of program space occupied by human examples, were indistinguishable from human-created games. We also discovered that our model's internal fitness scores predict games that are evaluated as more fun to play and more human-like.

Editor's summary: To enable artificial agents to generate human-like goals, a model must capture the complexity and diversity of human goals. Davidson et al. model playful goals from a naturalistic experiment as reward-producing programs, mapping an agent's behaviour to goal success. They then develop a computational model to generate diverse human-like goals.
{"title":"Scalable and robust DNA-based storage via coding theory and deep learning","authors":"Daniella Bar-Lev, Itai Orr, Omer Sabary, Tuvi Etzion, Eitan Yaakobi","doi":"10.1038/s42256-025-01003-z","DOIUrl":"https://doi.org/10.1038/s42256-025-01003-z","url":null,"abstract":"<p>The global data sphere is expanding exponentially, projected to hit 180 zettabytes by 2025, whereas current technologies are not anticipated to scale at nearly the same rate. DNA-based storage emerges as a crucial solution to this gap, enabling digital information to be archived in DNA molecules. This method enjoys major advantages over magnetic and optical storage solutions such as exceptional information density, enhanced data durability and negligible power consumption to maintain data integrity. To access the data, an information retrieval process is employed, where some of the main bottlenecks are the scalability and accuracy, which have a natural tradeoff between the two. Here we show a modular and holistic approach that combines deep neural networks trained on simulated data, tensor product-based error-correcting codes and a safety margin mechanism into a single coherent pipeline. We demonstrated our solution on 3.1 MB of information using two different sequencing technologies. Our work improves upon the current leading solutions with a 3,200× increase in speed and a 40% improvement in accuracy and offers a code rate of 1.6 bits per base in a high-noise regime. In a broader sense, our work shows a viable path to commercial DNA storage solutions hindered by current information retrieval processes.</p>","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"6 1","pages":""},"PeriodicalIF":23.8,"publicationDate":"2025-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143462923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Categorizing robots by performance fitness into the tree of robots
Robin Jeanne Kirschner, Kübra Karacan, Alessandro Melone, Sami Haddadin
Nature Machine Intelligence 7(3), 459–470 (2025). doi:10.1038/s42256-025-00995-y (open access: https://www.nature.com/articles/s42256-025-00995-y.pdf)

Robots are typically classified based on specific morphological features, like their kinematic structure. However, a complex interplay between morphology and intelligence shapes how well a robot performs processes. Just as delicate surgical procedures demand high dexterity and tactile precision, manual warehouse or construction work requires strength and endurance. These process requirements necessitate robot systems that provide a level of performance fitting the process. In this work, we introduce the tree of robots as a taxonomy to bridge the gap between morphological classification and process-based performance. It classifies robots based on their fitness to perform, for example, physical interaction processes. Using 11 industrial manipulators, we constructed the first part of the tree of robots based on a carefully deduced set of metrics reflecting fundamental robot capabilities for various industrial physical interaction processes. Through significance analysis, we identified substantial differences between the systems, grouping them via an expectation-maximization algorithm to create a fitness-based robot classification that is open for contributions and accessible.

Editor's summary: It is challenging to compare how well robots perform a task, as the evaluation depends on the process and skills required. The authors propose grouping robots into a taxonomy based on their performance on a set of embodied skill benchmarks.

Deep lead optimization enveloped in protein pocket and its application in designing potent and selective ligands targeting LTK protein
Shicheng Chen, Odin Zhang, Chenran Jiang, Huifeng Zhao, Xujun Zhang, Mengting Chen, Yun Liu, Qun Su, Zhenxing Wu, Xinyue Wang, Wanglin Qu, Yuanyi Ye, Xin Chai, Ning Wang, Tianyue Wang, Yuan An, Guanlin Wu, Qianqian Yang, Jiean Chen, Wei Xie, Haitao Lin, Dan Li, Chang-Yu Hsieh, Yong Huang, Yu Kang, Tingjun Hou, Peichen Pan
Nature Machine Intelligence 7(3), 448–458 (2025). doi:10.1038/s42256-025-00997-w

Optimizing the chemical structure of promising drug candidates through systematic modifications to improve potency and physicochemical properties is a vital step in the drug discovery pipeline. In contrast to the well-established de novo generation schemes, computational methods specifically tailored for lead optimization remain largely underexplored. Prior models are often limited to addressing specific subtasks, such as generating two-dimensional molecular structures, while neglecting crucial protein–ligand interactions in three-dimensional space. To overcome these challenges, we propose Delete (Deep lead optimization enveloped in protein pocket), a one-stop solution for lead optimization that combines generative artificial intelligence and structure-based approaches. Our model can handle all subtasks of lead optimization through a unified deleting (masking) strategy, and it accounts for intricate pocket–ligand interactions through an equivariant network design. Statistical assessments and retrospective studies across individual subtasks demonstrate that Delete has an outstanding ability to craft molecules with superior protein-binding energy and reasonable drug-likeness from given fragments or atoms. Subsequently, we used Delete to design inhibitors targeting the previously identified LTK protein. Among the ligands designed by Delete, CA-B-1 was successfully validated as a potent (1.36 nM) and selective inhibitor in in vitro and in vivo experiments. This work represents a successful implementation of the powerful structure-based lead optimization model, Delete, for rapid and controllable rational drug design.

Editor's summary: Chen et al. present a deep learning-based lead optimization model that combines generative artificial intelligence with structure-based approaches. The method is successfully applied to the design of drug-like molecules targeting the recently identified LTK protein with high potency and selectivity.
{"title":"Bridging peptide presentation and T cell recognition with multi-task learning","authors":"Li Su, Duolin Wang, Dong Xu","doi":"10.1038/s42256-025-01004-y","DOIUrl":"10.1038/s42256-025-01004-y","url":null,"abstract":"The immunogenic binding interactions of antigens are complex and interconnected. A new transformer-based model can simultaneously predict the bindings of antigens to two main receptors.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 2","pages":"170-171"},"PeriodicalIF":18.8,"publicationDate":"2025-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143452005","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Physical benchmarks for testing algorithms","authors":"Jakob Zeitler","doi":"10.1038/s42256-025-00999-8","DOIUrl":"10.1038/s42256-025-00999-8","url":null,"abstract":"The development of comprehensive benchmarks to assess the performance of algorithms on causal tasks is an important, emerging area. The introduction of two physical ‘causal chamber’ systems serves as a firm step towards future, more reliable benchmarks in the field.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 2","pages":"166-167"},"PeriodicalIF":18.8,"publicationDate":"2025-02-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143452006","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Large language models that replace human participants can harmfully misportray and flatten identity groups
Angelina Wang, Jamie Morgenstern, John P. Dickerson
Nature Machine Intelligence 7(3), 400–411 (2025). doi:10.1038/s42256-025-00986-z

Large language models (LLMs) are increasing in capability and popularity, propelling their application in new domains, including as replacements for human participants in computational social science, user testing, annotation tasks and so on. In many settings, researchers seek to distribute their surveys to a sample of participants that are representative of the underlying human population of interest. This means that to be a suitable replacement, LLMs will need to be able to capture the influence of positionality (that is, the relevance of social identities like gender and race). However, we show that there are two inherent limitations in the way current LLMs are trained that prevent this. We argue analytically for why LLMs are likely to both misportray and flatten the representations of demographic groups, and then empirically show this on four LLMs through a series of human studies with 3,200 participants across 16 demographic identities. We also discuss a third limitation about how identity prompts can essentialize identities. Throughout, we connect each limitation to a pernicious history of epistemic injustice against the value of lived experiences that explains why replacement is harmful for marginalized demographic groups. Overall, we urge caution in use cases in which LLMs are intended to replace human participants whose identities are relevant to the task at hand. At the same time, in cases where the benefits of LLM replacement are determined to outweigh the harms (for example, engaging human participants may cause them harm, or the goal is to supplement rather than fully replace), we empirically demonstrate that our inference-time techniques reduce, but do not remove, these harms.

Editor's summary: Large language models are being considered to simulate responses from participants of different backgrounds in computational social science experiments. Here it is shown that this practice can misportray and flatten demographic groups in distinctively harmful ways.

Rethinking machine unlearning for large language models
Sijia Liu, Yuanshun Yao, Jinghan Jia, Stephen Casper, Nathalie Baracaldo, Peter Hase, Yuguang Yao, Chris Yuhao Liu, Xiaojun Xu, Hang Li, Kush R. Varshney, Mohit Bansal, Sanmi Koyejo, Yang Liu
Nature Machine Intelligence 7(2), 181–194 (2025). doi:10.1038/s42256-025-00985-0

We explore machine unlearning in the domain of large language models (LLMs), referred to as LLM unlearning. This initiative aims to eliminate undesirable data influence (for example, sensitive or illegal information) and the associated model capabilities, while maintaining the integrity of essential knowledge generation and not affecting causally unrelated information. We envision LLM unlearning becoming a pivotal element in the life-cycle management of LLMs, potentially standing as an essential foundation for developing generative artificial intelligence that is not only safe, secure and trustworthy but also resource-efficient without the need for full retraining. We navigate the unlearning landscape in LLMs from conceptual formulation, methodologies, metrics and applications. In particular, we highlight the often-overlooked aspects of existing LLM unlearning research, for example, unlearning scope, data–model interaction and multifaceted efficacy assessment. We also draw connections between LLM unlearning and related areas such as model editing, influence functions, model explanation, adversarial training and reinforcement learning. Furthermore, we outline an effective assessment framework for LLM unlearning and explore its applications in copyright and privacy safeguards and sociotechnical harm reduction.

Editor's summary: Machine unlearning techniques remove undesirable data and associated model capabilities while preserving essential knowledge, so that machine learning models can be updated without costly retraining. Liu et al. review recent advances and opportunities in machine unlearning in LLMs, revisiting methodologies and overlooked principles for future improvements and exploring emerging applications in copyright and privacy safeguards and in reducing sociotechnical harms.

Towards a more inductive world for drug repurposing approaches
Jesus de la Fuente, Guillermo Serrano, Uxía Veleiro, Mikel Casals, Laura Vera, Marija Pizurica, Nuria Gómez-Cebrián, Leonor Puchades-Carrasco, Antonio Pineda-Lucena, Idoia Ochoa, Silve Vicent, Olivier Gevaert, Mikel Hernaez
Nature Machine Intelligence 7(3), 495–508 (2025). doi:10.1038/s42256-025-00987-y (open access: https://www.nature.com/articles/s42256-025-00987-y.pdf)

Drug–target interaction (DTI) prediction is a challenging albeit essential task in drug repurposing. Learning on graph models has drawn special attention because it can substantially reduce drug repurposing costs and time commitment. However, many current approaches require demanding additional information beyond DTIs, which complicates their evaluation and usability. Additionally, structural differences in the learning architecture of current models hinder their fair benchmarking. In this work, we first perform an in-depth evaluation of current DTI datasets and prediction models through a robust benchmarking process and show that DTI methods based on transductive models lack generalization and lead to inflated performance when evaluated in the traditional way, making them unsuitable for drug repurposing. We then propose a biologically driven strategy for negative-edge subsampling and, through in vitro validation, uncover previously unknown interactions that traditional subsampling misses. Finally, we provide a toolbox from all generated resources, crucial for fair benchmarking and robust model design.

Editor's summary: The authors address the challenge of predicting drug–target interactions, which is crucial for drug repurposing, by introducing a robust benchmarking framework. Using a biologically driven strategy, they uncover previously unknown interactions.
{"title":"Benchmarking AI-powered docking methods from the perspective of virtual screening","authors":"Shukai Gu, Chao Shen, Xujun Zhang, Huiyong Sun, Heng Cai, Hao Luo, Huifeng Zhao, Bo Liu, Hongyan Du, Yihao Zhao, Chenggong Fu, Silong Zhai, Yafeng Deng, Huanxiang Liu, Tingjun Hou, Yu Kang","doi":"10.1038/s42256-025-00993-0","DOIUrl":"10.1038/s42256-025-00993-0","url":null,"abstract":"Recently, many artificial intelligence (AI)-powered protein–ligand docking and scoring methods have been developed, demonstrating impressive speed and accuracy. However, these methods often neglected the physical plausibility of the docked complexes and their efficacy in virtual screening (VS) projects. Therefore, we conducted a comprehensive benchmark analysis of four AI-powered and four physics-based docking tools and two AI-enhanced rescoring methods. We initially constructed the TrueDecoy set, a dataset on which the redocking experiments revealed that KarmaDock and CarsiDock surpassed all physics-based tools in docking accuracy, whereas all physics-based tools notably outperformed AI-based methods in structural rationality. The low physical plausibility of docked structures generated by the top AI method, CarsiDock, mainly stems from insufficient intermolecular validity. The VS results on the TrueDecoy set highlight the effectiveness of RTMScore as a rescore function, and Glide-based methods achieved the highest enrichment factors among all docking tools. Furthermore, we created the RandomDecoy set, a dataset that more closely resembles real-world VS scenarios, where AI-based tools obviously outperformed Glide. Additionally, we found that the employed ligand-based postprocessing methods had a weak or even negative impact on optimizing the conformations of docked complexes and enhancing VS performance. Finally, we proposed a hierarchical VS strategy that could efficiently and accurately enrich active molecules in large-scale VS projects. Artificial intelligence (AI)-based docking and scoring methods demonstrate considerable potential for virtual drug screening. Gu et al. go further by assessing the structural rationality of AI-predicted complex conformations from various sources.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 3","pages":"509-520"},"PeriodicalIF":18.8,"publicationDate":"2025-02-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143401248","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}