Sequential memory improves sample and memory efficiency in episodic control
Ismael T. Freire, Adrián F. Amil, Paul F. M. J. Verschure
Nature Machine Intelligence (31 December 2024). DOI: 10.1038/s42256-024-00950-3

Deep reinforcement learning algorithms are known for their sample inefficiency, requiring extensive episodes to reach optimal performance. Episodic reinforcement learning algorithms aim to overcome this issue by using extended memory systems to leverage past experiences. However, these memory augmentations are often used as mere buffers from which isolated events are resampled for offline learning (for example, replay). In this Article, we introduce Sequential Episodic Control (SEC), a hippocampal-inspired model that stores entire event sequences in their temporal order and employs a sequential bias in their retrieval to guide actions. We evaluate SEC across various benchmarks from the Animal-AI testbed, demonstrating its superior performance and sample efficiency compared to several state-of-the-art models, including Model-Free Episodic Control, Deep Q-Network and Episodic Reinforcement Learning with Associative Memory. Our experiments show that SEC achieves higher rewards and faster policy convergence in tasks requiring memory and decision-making. Additionally, we investigate the effects of memory constraints and forgetting mechanisms, revealing that prioritized forgetting enhances both performance and policy stability. Further, ablation studies demonstrate the critical role of the sequential memory component in SEC. Finally, we discuss how fast, sequential hippocampal-like episodic memory systems could support both habit formation and deliberation in artificial and biological systems.
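The abstract describes the core SEC idea: episodes are stored whole, in temporal order, and retrieval is biased toward continuing the best-matching stored sequence. A minimal sketch of that retrieval scheme follows; the class name, the Euclidean similarity metric and the FIFO forgetting rule are illustrative assumptions, not the paper's implementation (the paper uses prioritized forgetting, which FIFO merely stands in for here).

```python
import numpy as np

class SequentialEpisodicMemory:
    """Toy sequence-based episodic memory with a sequential retrieval bias."""

    def __init__(self, capacity=100):
        self.episodes = []        # each episode: ordered list of (state, action) pairs
        self.capacity = capacity

    def store_episode(self, episode):
        # Simple FIFO forgetting as a stand-in for prioritized forgetting.
        if len(self.episodes) >= self.capacity:
            self.episodes.pop(0)
        self.episodes.append(episode)

    def retrieve_action(self, state):
        # Find the stored (state, action) pair most similar to the query state,
        # then return the action of the *next* step in that episode: the
        # sequential bias favours continuing the matched sequence.
        best, best_sim = None, -np.inf
        for ep in self.episodes:
            for t, (s, _a) in enumerate(ep):
                sim = -np.linalg.norm(np.asarray(s, float) - np.asarray(state, float))
                if sim > best_sim:
                    best_sim, best = sim, (ep, t)
        ep, t = best
        nxt = min(t + 1, len(ep) - 1)  # clamp at the episode's final step
        return ep[nxt][1]
```

Storing one three-step episode and querying near its first state returns the action of the second step, illustrating how retrieval continues a sequence rather than replaying an isolated event.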
Delineating the effective use of self-supervised learning in single-cell genomics
Till Richter, Mojtaba Bahrami, Yufan Xia, David S. Fischer, Fabian J. Theis
Nature Machine Intelligence (27 December 2024). DOI: 10.1038/s42256-024-00934-3

Self-supervised learning (SSL) has emerged as a powerful method for extracting meaningful representations from vast, unlabelled datasets, transforming computer vision and natural language processing. In single-cell genomics (SCG), representation learning offers insights into complex biological data, especially with emerging foundation models. However, identifying scenarios in SCG where SSL outperforms traditional learning methods remains a nuanced challenge. Furthermore, selecting the most effective pretext tasks within the SSL framework for SCG is a critical yet unresolved question. Here we address this gap by adapting and benchmarking SSL methods in SCG, including masked autoencoders with multiple masking strategies and contrastive learning methods. Models trained on over 20 million cells were examined across multiple downstream tasks, including cell-type prediction, gene-expression reconstruction, cross-modality prediction and data integration. Our empirical analyses underscore the nuanced role of SSL, namely in transfer learning scenarios leveraging auxiliary data or analysing unseen datasets. Masked autoencoders excel over contrastive methods in SCG, diverging from computer vision trends. Moreover, our findings reveal the notable capabilities of SSL in zero-shot settings and its potential in cross-modality prediction and data integration. In summary, we study SSL methods in SCG on fully connected networks and benchmark their utility across key representation learning scenarios.
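The masked-autoencoder pretext task the abstract benchmarks can be illustrated with a toy example: mask a fraction of gene-expression entries, reconstruct them, and score the loss only on the masked positions. The zero-masking strategy, the mean-imputation "decoder" and all variable names below are simplifying assumptions for illustration; the paper trains fully connected networks, not this stand-in.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.poisson(2.0, size=(4, 6)).astype(float)  # toy cell-by-gene count matrix

# One masking strategy: hide a random 50% of entries by zeroing them out.
mask = rng.random(X.shape) < 0.5
X_masked = np.where(mask, 0.0, X)

# Trivial stand-in for a learned decoder: predict each masked gene from the
# mean of the unmasked genes in the same cell.
row_means = X_masked.sum(axis=1) / np.maximum((~mask).sum(axis=1), 1)
X_hat = np.where(mask, row_means[:, None], X)

# The self-supervised objective scores reconstruction only on masked entries.
loss = np.mean((X_hat[mask] - X[mask]) ** 2)
```

Unmasked entries pass through untouched, so all learning signal comes from reconstructing the hidden values, which is the defining property of the masked pretext task.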
Seeking clarity rather than strong opinions on intelligence
Editorial. Nature Machine Intelligence 6(12), 1408 (18 December 2024). DOI: 10.1038/s42256-024-00968-7
Open access PDF: https://www.nature.com/articles/s42256-024-00968-7.pdf

Clear descriptions of intelligence in both living organisms and machines are essential to avoid confusion, sharpen thinking and guide interdisciplinary research. A Comment in this issue encourages researchers to answer key questions to improve clarity on the terms they use.
Strategies needed to counter potential AI misuse
Editorial. Nature Machine Intelligence 6(12), 1407 (18 December 2024). DOI: 10.1038/s42256-024-00967-8
Open access PDF: https://www.nature.com/articles/s42256-024-00967-8.pdf

Researchers urgently need more guidance to help them identify and mitigate potential risks when designing projects that involve AI developments.
Leveraging ancestral sequence reconstruction for protein representation learning
D. S. Matthews, M. A. Spence, A. C. Mater, J. Nichols, S. B. Pulsford, M. Sandhu, J. A. Kaczmarski, C. M. Miton, N. Tokuriki, C. J. Jackson
Nature Machine Intelligence 6(12), 1542-1555 (18 December 2024). DOI: 10.1038/s42256-024-00935-2

Protein language models (PLMs) convert amino acid sequences into the numerical representations required to train machine learning models. Many PLMs are large (>600 million parameters) and trained on a broad span of protein sequence space. However, these models have limitations in terms of predictive accuracy and computational cost. Here we use multiplexed ancestral sequence reconstruction to generate small but focused functional protein sequence datasets for PLM training. Compared to large PLMs, this local ancestral sequence embedding produces representations with higher predictive accuracy. We show that, owing to the evolutionary nature of the ancestral sequence reconstruction data, local ancestral sequence embedding produces smoother fitness landscapes, in which protein variants that are closer in fitness value become numerically closer in representation space. This work contributes to the implementation of machine learning-based protein design in real-world settings, where data are sparse and computational resources are limited.

Matthews et al. present a protein sequence embedding based on data from ancestral sequences that allows machine learning to be used for tasks where training data are scarce or expensive.
Reply to: Limitations in odour recognition and generalization in a neuromorphic olfactory circuit
Roy Moyal, Nabil Imam, Thomas A. Cleland
Nature Machine Intelligence 6(12), 1454-1456 (16 December 2024). DOI: 10.1038/s42256-024-00951-2
Limitations in odour recognition and generalization in a neuromorphic olfactory circuit
Nik Dennler, André van Schaik, Michael Schmuker
Nature Machine Intelligence 6(12), 1451-1453 (16 December 2024). DOI: 10.1038/s42256-024-00952-1
Stable Cox regression for survival analysis under distribution shifts
Shaohua Fan, Renzhe Xu, Qian Dong, Yue He, Cheng Chang, Peng Cui
Nature Machine Intelligence 6(12), 1525-1541 (13 December 2024). DOI: 10.1038/s42256-024-00932-5
Open access PDF: https://www.nature.com/articles/s42256-024-00932-5.pdf

Survival analysis aims to estimate the impact of covariates on the expected time until an event occurs, and is broadly utilized in disciplines such as the life sciences and healthcare, substantially influencing decision-making and improving survival outcomes. Existing methods, which usually assume similar training and testing distributions, nevertheless face challenges with real-world varying data sources, creating unpredictable shifts that undermine their reliability. This urgently necessitates that survival analysis methods utilize stable features across diverse cohorts for prediction, rather than relying on spurious correlations. To this end, we propose a stable Cox model with theoretical guarantees to identify stable variables, which jointly optimizes an independence-driven sample reweighting module and a weighted Cox regression model. Through extensive evaluation on simulated and real-world omics and clinical data, stable Cox not only shows strong generalization ability across diverse independent test sets but also significantly stratifies patient subtypes with the identified biomarker panels.

Survival prediction models used in healthcare usually assume that training and test data share a similar distribution, which is not true in real-world settings. Cui and colleagues develop a stable Cox regression model that can identify stable variables for predicting survival outcomes under distribution shifts.
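The weighted Cox regression component that the abstract mentions can be sketched as a per-sample-weighted negative log partial likelihood; the independence-driven reweighting module would supply the weights, which here are simply inputs. The function name and the Breslow-style handling (no correction for tied event times) are assumptions for illustration, not the paper's code.

```python
import numpy as np

def weighted_cox_neg_log_partial_likelihood(beta, X, time, event, w):
    """Weighted Cox negative log partial likelihood (Breslow-style, no ties).

    beta  : (p,) coefficient vector
    X     : (n, p) covariates
    time  : (n,) observed times
    event : (n,) 1 if the event occurred, 0 if censored
    w     : (n,) per-sample weights from an upstream reweighting module
    """
    order = np.argsort(-time)                 # latest time first
    X, event, w = X[order], event[order], w[order]
    eta = X @ beta                            # linear predictor
    # Cumulative sum over the descending-time order gives, at position i,
    # the weighted risk set {j : time_j >= time_i}.
    risk = np.cumsum(w * np.exp(eta))
    ll = np.sum(w * event * (eta - np.log(risk)))
    return -ll
```

With all weights equal to one this reduces to the ordinary Cox partial likelihood; downweighting samples whose covariates carry spurious, environment-specific correlations is what lets the joint optimization focus on stable variables.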
Kernel approximation using analogue in-memory computing
Julian Büchel, Giacomo Camposampiero, Athanasios Vasilopoulos, Corey Lammie, Manuel Le Gallo, Abbas Rahimi, Abu Sebastian
Nature Machine Intelligence 6(12), 1605-1615 (13 December 2024). DOI: 10.1038/s42256-024-00943-2

Kernel functions are vital ingredients of several machine learning (ML) algorithms but often incur substantial memory and computational costs. We introduce an approach to kernel approximation in ML algorithms suitable for mixed-signal analogue in-memory computing (AIMC) architectures. Analogue in-memory kernel approximation addresses the performance bottlenecks of conventional kernel-based methods by executing most operations in approximate kernel methods directly in memory. The IBM HERMES project chip, a state-of-the-art phase-change memory-based AIMC chip, is utilized for the hardware demonstration of kernel approximation. Experimental results show that our method maintains high accuracy, with less than a 1% drop on kernel-based ridge classification benchmarks and within 1% accuracy on the long-range arena benchmark for kernelized attention in transformer neural networks. Compared to traditional digital accelerators, our approach is estimated to deliver superior energy efficiency and lower power consumption. These findings highlight the potential of heterogeneous AIMC architectures to enhance the efficiency and scalability of ML applications.

A kernel approximation method that enables linear-complexity attention computation via analogue in-memory computing (AIMC) to deliver superior energy efficiency is demonstrated on a multicore AIMC chip.
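The general idea behind approximate kernel methods of this kind is to replace an exact kernel evaluation with an inner product of explicit feature maps, which reduces to the matrix-vector products that AIMC hardware accelerates. As a software illustration only, here is the classic random Fourier feature approximation of an RBF kernel; the choice of RBF kernel, feature count and all parameter names are assumptions and this does not reproduce the paper's method or hardware.

```python
import numpy as np

rng = np.random.default_rng(1)
d, D = 3, 2000        # input dimension, number of random features
gamma = 0.5           # RBF kernel: k(x, y) = exp(-gamma * ||x - y||^2)

# Random Fourier features: phi(x) = sqrt(2/D) * cos(W x + b), with the rows
# of W drawn from the kernel's spectral density (Gaussian, std sqrt(2*gamma)).
W = rng.normal(0.0, np.sqrt(2.0 * gamma), size=(D, d))
b = rng.uniform(0.0, 2.0 * np.pi, size=D)

def phi(x):
    return np.sqrt(2.0 / D) * np.cos(W @ x + b)

x, y = rng.normal(size=d), rng.normal(size=d)
exact = np.exp(-gamma * np.sum((x - y) ** 2))  # true kernel value
approx = phi(x) @ phi(y)                       # inner product of feature maps
```

Because `phi` turns kernel evaluations into fixed-dimension dot products, attention-style computations over n tokens drop from quadratic to linear complexity in n, which is the property the in-memory demonstration exploits.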
Envisioning better benchmarks for machine learning PDE solvers
Johannes Brandstetter
Nature Machine Intelligence (13 December 2024). DOI: 10.1038/s42256-024-00962-z

Tackling partial differential equations with machine learning solvers is a promising direction, but recent analysis reveals challenges in making fair comparisons to previous methods. Stronger benchmark problems are needed for the field to advance.