Lars Johannsmeier, Samuel Schneider, Yanan Li, Etienne Burdet, Sami Haddadin
{"title":"A process-centric manipulation taxonomy for the organization, classification and synthesis of tactile robot skills","authors":"Lars Johannsmeier, Samuel Schneider, Yanan Li, Etienne Burdet, Sami Haddadin","doi":"10.1038/s42256-025-01045-3","DOIUrl":"10.1038/s42256-025-01045-3","url":null,"abstract":"Despite decades of research in robotic manipulation, only a few autonomous manipulation skills are currently used. Traditional and machine-learning-based end-to-end solutions have shown substantial progress but still struggle to generate reliable manipulation skills for difficult processes like insertion or bending material. To facilitate the deployment and learning of tactile robot manipulation skills, we introduce here a taxonomy based on formal process specifications provided by experts, which assigns a suitable skill to a given process. We validated the inherent scalability of the taxonomy on 28 different skills from industrial application domains. The experimental results had success rates close to 100%, even under goal pose disturbances, with high performance attained by the skill models in terms of execution times and contact moments in partially known environments. The basic elements of the models are reusable and facilitate skill-learning to optimize control performance. Like established curricula for human trainees, this framework could provide a comprehensive platform that enables robots to acquire relevant manipulation skills and act as a catalyst to propel automation beyond its current capabilities. Despite decades of research, autonomous robotic manipulation skills remain limited, especially for complex tasks such as insertion or bending of materials. Johannsmeier et al. 
introduce a taxonomy of manipulation skills that synthesizes tactile behaviours from process specifications, achieving high robustness and performance with minimal learning time.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 6","pages":"916-927"},"PeriodicalIF":23.9,"publicationDate":"2025-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.nature.com/articles/s42256-025-01045-3.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144340813","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Janosh Riebesell, Rhys E. A. Goodall, Philipp Benner, Yuan Chiang, Bowen Deng, Gerbrand Ceder, Mark Asta, Alpha A. Lee, Anubhav Jain, Kristin A. Persson
{"title":"A framework to evaluate machine learning crystal stability predictions","authors":"Janosh Riebesell, Rhys E. A. Goodall, Philipp Benner, Yuan Chiang, Bowen Deng, Gerbrand Ceder, Mark Asta, Alpha A. Lee, Anubhav Jain, Kristin A. Persson","doi":"10.1038/s42256-025-01055-1","DOIUrl":"10.1038/s42256-025-01055-1","url":null,"abstract":"The rapid adoption of machine learning in various scientific domains calls for the development of best practices and community agreed-upon benchmarking tasks and metrics. We present Matbench Discovery as an example evaluation framework for machine learning energy models, here applied as pre-filters to first-principles computed data in a high-throughput search for stable inorganic crystals. We address the disconnect between (1) thermodynamic stability and formation energy and (2) retrospective and prospective benchmarking for materials discovery. Alongside this paper, we publish a Python package to aid with future model submissions and a growing online leaderboard with adaptive user-defined weighting of various performance metrics allowing researchers to prioritize the metrics they value most. To answer the question of which machine learning methodology performs best at materials discovery, our initial release includes random forests, graph neural networks, one-shot predictors, iterative Bayesian optimizers and universal interatomic potentials. We highlight a misalignment between commonly used regression metrics and more task-relevant classification metrics for materials discovery. Accurate regressors are susceptible to unexpectedly high false-positive rates if those accurate predictions lie close to the decision boundary at 0 eV per atom above the convex hull. The benchmark results demonstrate that universal interatomic potentials have advanced sufficiently to effectively and cheaply pre-screen thermodynamic stable hypothetical materials in future expansions of high-throughput materials databases. Riebesell et al. 
introduce Matbench Discovery, a framework to compare machine learning models used to identify stable crystals. Out of several architectures, they find that universal interatomic potentials perform best in the competition.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 6","pages":"836-847"},"PeriodicalIF":23.9,"publicationDate":"2025-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.nature.com/articles/s42256-025-01055-1.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144340814","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Johannes Kruse, Kasper Lindskow, Michael Riis Andersen, Jes Frellsen
{"title":"Why design choices matter in recommender systems","authors":"Johannes Kruse, Kasper Lindskow, Michael Riis Andersen, Jes Frellsen","doi":"10.1038/s42256-025-01043-5","DOIUrl":"10.1038/s42256-025-01043-5","url":null,"abstract":"In the RecSys ’24 Challenge, participants tackled news recommendations using a large-scale Danish dataset. Although the top-performing models achieved similar accuracy scores, they produced markedly different results on beyond-accuracy metrics, highlighting the need for further research into the normative alignment of recommender systems.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 6","pages":"979-980"},"PeriodicalIF":23.9,"publicationDate":"2025-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144370941","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Shachar Don-Yehiya, Ben Burtenshaw, Ramon Fernandez Astudillo, Cailean Osborne, Mimansa Jaiswal, Tzu-Sheng Kuo, Wenting Zhao, Idan Shenfeld, Andi Peng, Mikhail Yurochkin, Atoosa Kasirzadeh, Yangsibo Huang, Tatsunori Hashimoto, Yacine Jernite, Daniel Vila-Suero, Omri Abend, Jennifer Ding, Sara Hooker, Hannah Rose Kirk, Leshem Choshen
{"title":"The future of open human feedback","authors":"Shachar Don-Yehiya, Ben Burtenshaw, Ramon Fernandez Astudillo, Cailean Osborne, Mimansa Jaiswal, Tzu-Sheng Kuo, Wenting Zhao, Idan Shenfeld, Andi Peng, Mikhail Yurochkin, Atoosa Kasirzadeh, Yangsibo Huang, Tatsunori Hashimoto, Yacine Jernite, Daniel Vila-Suero, Omri Abend, Jennifer Ding, Sara Hooker, Hannah Rose Kirk, Leshem Choshen","doi":"10.1038/s42256-025-01038-2","DOIUrl":"10.1038/s42256-025-01038-2","url":null,"abstract":"Human feedback on conversations with language models is central to how these systems learn about the world, improve their capabilities and are steered towards desirable and safe behaviours. However, this feedback is mostly collected by frontier artificial intelligence labs and kept behind closed doors. Here we bring together interdisciplinary experts to assess the opportunities and challenges to realizing an open ecosystem of human feedback for artificial intelligence. We first look for successful practices in the peer-production, open-source and citizen-science communities. We then characterize the main challenges for open human feedback. For each, we survey current approaches and offer recommendations. We end by envisioning the components needed to underpin a sustainable and open human feedback ecosystem. In the centre of this ecosystem are mutually beneficial feedback loops, between users and specialized models, incentivizing a diverse stakeholder community of model trainers and feedback providers to support a general open feedback pool. Don-Yehiya et al. 
explore creating an open ecosystem for human feedback on large language models, drawing from peer-production, open-source and citizen-science practices, and addressing key challenges to establish sustainable feedback loops between users and specialized models.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 6","pages":"825-835"},"PeriodicalIF":23.9,"publicationDate":"2025-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144328872","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Aude Billard, Alin Albu-Schaeffer, Michael Beetz, Wolfram Burgard, Peter Corke, Matei Ciocarlie, Ravinder Dahiya, Danica Kragic, Ken Goldberg, Yukie Nagai, Davide Scaramuzza
{"title":"A roadmap for AI in robotics","authors":"Aude Billard, Alin Albu-Schaeffer, Michael Beetz, Wolfram Burgard, Peter Corke, Matei Ciocarlie, Ravinder Dahiya, Danica Kragic, Ken Goldberg, Yukie Nagai, Davide Scaramuzza","doi":"10.1038/s42256-025-01050-6","DOIUrl":"10.1038/s42256-025-01050-6","url":null,"abstract":"There is growing excitement about the potential of leveraging artificial intelligence (AI) to tackle some of the outstanding barriers to the full deployment of robots in daily lives. However, action and sensing in the physical world pose greater and different challenges for AI than analysing data in isolation and it is important to reflect on which AI approaches are most likely to be successfully applied to robots. Questions to address, among others, are how AI models can be adapted to specific robot designs, tasks and environments. This Perspective offers an assessment of what AI has achieved for robotics since the 1990s and proposes a research roadmap with challenges and promises. These range from keeping up-to-date large datasets, representatives of a diversity of tasks that robots may have to perform, and of environments they may encounter, to designing AI algorithms tailored specifically to robotics problems but generic enough to apply to a wide range of applications and transfer easily to a variety of robotic platforms. For robots to collaborate effectively with humans, they must predict human behaviour without relying on bias-based profiling. Explainability and transparency in AI-driven robot control are essential for building trust, preventing misuse and attributing responsibility in accidents. We close with describing what are, in our view, primary long-term challenges, namely, designing robots capable of lifelong learning, and guaranteeing safe deployment and usage, as well as sustainable development. AI technologies are advancing rapidly, offering new solutions for autonomous robot operation in complex environments. Aude Billard et al. 
discuss the need to identify and adapt AI technologies for robotics, proposing a research roadmap to address key challenges and opportunities.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 6","pages":"818-824"},"PeriodicalIF":23.9,"publicationDate":"2025-06-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144319894","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yuyang Zhang, Yuhang Liu, Zinnia Ma, Min Li, Chunfu Xu, Haipeng Gong
{"title":"Improving diffusion-based protein backbone generation with global-geometry-aware latent encoding","authors":"Yuyang Zhang, Yuhang Liu, Zinnia Ma, Min Li, Chunfu Xu, Haipeng Gong","doi":"10.1038/s42256-025-01059-x","DOIUrl":"10.1038/s42256-025-01059-x","url":null,"abstract":"The global structural properties of a protein, such as shape, fold and topology, strongly affect its function. Although recent breakthroughs in diffusion-based generative models have greatly advanced de novo protein design, particularly in generating diverse and realistic structures, it remains challenging to design proteins of specific geometries without residue-level control over the topological details. A more practical, top-down approach is needed for prescribing the overall geometric arrangements of secondary structure elements in the generated protein structures. In response, we propose TopoDiff, an unsupervised framework that learns and exploits a global-geometry-aware latent representation, enabling both unconditional and controllable diffusion-based protein generation. Trained on the Protein Data Bank and CATH datasets, the structure encoder embeds protein global geometries into a 32-dimensional latent space, from which latent codes sampled by the latent sampler serve as informative conditions for the diffusion-based backbone decoder. In benchmarks against existing baselines, TopoDiff demonstrates comparable performance on established metrics including designability, diversity and novelty, as well as markedly improves coverage over the fold types of natural proteins in the CATH dataset. Moreover, latent conditioning enables versatile manipulations at the global-geometry level to control the generated protein structures, through which we derived a number of novel folds of mainly beta proteins with comprehensive experimental validation. 
A variational-autoencoder-based diffusion architecture that enables topological controls on the diffusion-based protein structure generation is proposed. As a result, novel folds of mainly beta proteins can be designed with experimental validation.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 7","pages":"1104-1118"},"PeriodicalIF":23.9,"publicationDate":"2025-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144311696","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yong He, Pan Fang, Yongtao Shan, Yuanfei Pan, Yanhong Wei, Yichang Chen, Yihao Chen, Yi Liu, Zhenyu Zeng, Zhan Zhou, Feng Zhu, Edward C. Holmes, Jieping Ye, Jun Li, Yuelong Shu, Mang Shi, Zhaorong Li
{"title":"Generalized biological foundation model with unified nucleic acid and protein language","authors":"Yong He, Pan Fang, Yongtao Shan, Yuanfei Pan, Yanhong Wei, Yichang Chen, Yihao Chen, Yi Liu, Zhenyu Zeng, Zhan Zhou, Feng Zhu, Edward C. Holmes, Jieping Ye, Jun Li, Yuelong Shu, Mang Shi, Zhaorong Li","doi":"10.1038/s42256-025-01044-4","DOIUrl":"10.1038/s42256-025-01044-4","url":null,"abstract":"The language of biology, encoded in DNA, RNA and proteins, forms the foundation of life but remains challenging to decode owing to its complexity. Traditional computational methods often struggle to integrate information across these molecules, limiting a comprehensive understanding of biological systems. Advances in natural language processing with pre-trained models offer possibilities for interpreting biological language. Here we introduce LucaOne, a pre-trained foundation model trained on nucleic acid and protein sequences from 169,861 species. Through large-scale data integration and semi-supervised learning, LucaOne shows an understanding of key biological principles, such as DNA–protein translation. Using few-shot learning, it effectively comprehends the central dogma of molecular biology and performs competitively on tasks involving DNA, RNA or protein inputs. Our results highlight the potential of unified foundation models to address complex biological questions, providing an adaptable framework for bioinformatics research and enhancing the interpretation of life’s complexity. He and colleagues develop LucaOne, a biological foundation model pre-trained on nucleic acid and protein sequences from 169,861 species. 
It shows an emerging understanding of molecular biology’s central dogma, enhancing bioinformatics analysis and helping explore unknown aspects of molecular biology.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 6","pages":"942-953"},"PeriodicalIF":23.9,"publicationDate":"2025-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.nature.com/articles/s42256-025-01044-4.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144311695","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Nikolas Pontikos, William A. Woof, Siying Lin, Biraja Ghoshal, Bernardo S. Mendes, Advaith Veturi, Quang Nguyen, Behnam Javanmardi, Michalis Georgiou, Alexander Hustinx, Miguel A. Ibarra-Arellano, Ismail Moghul, Yichen Liu, Kristina Pfau, Maximilian Pfau, Mital Shah, Jing Yu, Saoud Al-Khuzaei, Siegfried K. Wagner, Malena Daich Varela, Thales Antonio Cabral de Guimarães, Sagnik Sen, Gunjan Naik, Dayyanah Sumodhee, Dun Jack Fu, Nathaniel Kabiri, Jennifer Furman, Bart Liefers, Aaron Y. Lee, Samantha R. De Silva, Caio Marques, Fabiana Motta, Yu Fujinami-Yokokawa, Alison J. Hardcastle, Gavin Arno, Birgit Lorenz, Philipp Herrmann, Kaoru Fujinami, Juliana Sallum, Savita Madhusudhan, Susan M. Downes, Frank G. Holz, Konstantinos Balaskas, Andrew R. Webster, Omar A. Mahroo, Peter M. Krawitz, Michel Michaelides
{"title":"Next-generation phenotyping of inherited retinal diseases from multimodal imaging with Eye2Gene","authors":"Nikolas Pontikos, William A. Woof, Siying Lin, Biraja Ghoshal, Bernardo S. Mendes, Advaith Veturi, Quang Nguyen, Behnam Javanmardi, Michalis Georgiou, Alexander Hustinx, Miguel A. Ibarra-Arellano, Ismail Moghul, Yichen Liu, Kristina Pfau, Maximilian Pfau, Mital Shah, Jing Yu, Saoud Al-Khuzaei, Siegfried K. Wagner, Malena Daich Varela, Thales Antonio Cabral de Guimarães, Sagnik Sen, Gunjan Naik, Dayyanah Sumodhee, Dun Jack Fu, Nathaniel Kabiri, Jennifer Furman, Bart Liefers, Aaron Y. Lee, Samantha R. De Silva, Caio Marques, Fabiana Motta, Yu Fujinami-Yokokawa, Alison J. Hardcastle, Gavin Arno, Birgit Lorenz, Philipp Herrmann, Kaoru Fujinami, Juliana Sallum, Savita Madhusudhan, Susan M. Downes, Frank G. Holz, Konstantinos Balaskas, Andrew R. Webster, Omar A. Mahroo, Peter M. Krawitz, Michel Michaelides","doi":"10.1038/s42256-025-01040-8","DOIUrl":"10.1038/s42256-025-01040-8","url":null,"abstract":"Rare eye diseases such as inherited retinal diseases (IRDs) are challenging to diagnose genetically. IRDs are typically monogenic disorders and represent a leading cause of blindness in children and working-age adults worldwide. A growing number are now being targeted in clinical trials, with approved treatments increasingly available. However, access requires a genetic diagnosis to be established sufficiently early. Critically, the timely identification of a genetic cause remains challenging. We demonstrate that a deep learning algorithm, Eye2Gene, trained on a large multimodal imaging dataset of individuals with IRDs (n = 2,451) and externally validated on data provided by five different clinical centres, provides better-than-expert-level top-five accuracy of 83.9% for supporting genetic diagnosis for the 63 most common genetic causes. 
We demonstrate that Eye2Gene’s next-generation phenotyping can increase diagnostic yield by improving screening for IRDs, phenotype-driven variant prioritization and automatic similarity matching in phenotypic space to identify new genes. Eye2Gene is accessible online ( app.eye2gene.com ) for research purposes. Eye2Gene’s next-generation phenotyping of multimodal images increases diagnostic yield for inherited retinal diseases by improving screening, phenotype-driven variant prioritization and automatic similarity matching in phenotypic space to drive gene discovery.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 6","pages":"967-978"},"PeriodicalIF":23.9,"publicationDate":"2025-06-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.nature.com/articles/s42256-025-01040-8.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144311697","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Peizhen Bai, Filip Miljković, Xianyuan Liu, Leonardo De Maria, Rebecca Croasdale-Wood, Owen Rackham, Haiping Lu
{"title":"Mask-prior-guided denoising diffusion improves inverse protein folding","authors":"Peizhen Bai, Filip Miljković, Xianyuan Liu, Leonardo De Maria, Rebecca Croasdale-Wood, Owen Rackham, Haiping Lu","doi":"10.1038/s42256-025-01042-6","DOIUrl":"10.1038/s42256-025-01042-6","url":null,"abstract":"Inverse protein folding generates valid amino acid sequences that can fold into a desired protein structure, with recent deep learning advances showing strong potential and competitive performance. However, challenges remain, such as predicting elements with high structural uncertainty, including disordered regions. To tackle such low-confidence residue prediction, we propose a mask-prior-guided denoising diffusion (MapDiff) framework that accurately captures both structural information and residue interactions for inverse protein folding. MapDiff is a discrete diffusion probabilistic model that iteratively generates amino acid sequences with reduced noise, conditioned on a given protein backbone. To incorporate structural information and residue interactions, we have developed a graph-based denoising network with a mask-prior pretraining strategy. Moreover, in the generative process, we combine the denoising diffusion implicit model with Monte-Carlo dropout to reduce uncertainty. Evaluation on four challenging sequence design benchmarks shows that MapDiff substantially outperforms state-of-the-art methods. Furthermore, the in silico sequences generated by MapDiff closely resemble the physico-chemical and structural characteristics of native proteins across different protein families and architectures. 
Bai and colleagues present MapDiff, a discrete diffusion-based framework for generating amino acid sequences conditioned on a target protein structure, with strong performance in predicting uncertain regions and achieving high in silico foldability.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 6","pages":"876-888"},"PeriodicalIF":23.9,"publicationDate":"2025-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.nature.com/articles/s42256-025-01042-6.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144296155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yuang Zhang, Yu Hu, Yunlong Song, Danping Zou, Weiyao Lin
{"title":"Learning vision-based agile flight via differentiable physics","authors":"Yuang Zhang, Yu Hu, Yunlong Song, Danping Zou, Weiyao Lin","doi":"10.1038/s42256-025-01048-0","DOIUrl":"10.1038/s42256-025-01048-0","url":null,"abstract":"Autonomous aerial robot swarms promise transformative applications, from planetary exploration to search and rescue in complex environments. However, navigating these swarms efficiently in unknown and cluttered spaces without bulky sensors, heavy computation or constant communication between robots remains a major research problem. This paper introduces an end-to-end approach that combines deep learning with first-principles physics through differentiable simulation to enable autonomous navigation by several aerial robots through complex environments at high speed. Our approach directly optimizes a neural network control policy by backpropagating loss gradients through the robot simulation using a simple point-mass physics model. Despite this simplicity, our method excels in both multi-agent and single-agent applications. In multi-agent scenarios, our system demonstrates self-organized behaviour, which enables autonomous coordination without communication or centralized planning. In single-agent scenarios, our system achieved a 90% success rate in navigating through complex unknown environments and demonstrated enhanced robustness compared to previous state-of-the-art approaches. Our system can operate without state estimation and adapt to dynamic obstacles. In real-world forest environments, it navigates at speeds of up to 20 m s−1, doubling the speed of previous imitation-learning-based solutions. Notably, all these capabilities are deployed on a budget-friendly US$21 computer, which costs less than 5% of the GPU-equipped board used in existing systems. Zhang et al. present a differentiable-physics simulation approach that enables autonomous aerial robot swarms to navigate complex environments. 
High-speed navigation and robust performance in both multi-agent and single-agent scenarios are demonstrated with low-cost hardware platforms.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 6","pages":"954-966"},"PeriodicalIF":23.9,"publicationDate":"2025-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144296156","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}