Dhuvarakesh Karthikeyan, Sarah N. Bennett, Amy G. Reynolds, Benjamin G. Vincent, Alex Rubinsteyn
{"title":"Conditional generation of real antigen-specific T cell receptor sequences","authors":"Dhuvarakesh Karthikeyan, Sarah N. Bennett, Amy G. Reynolds, Benjamin G. Vincent, Alex Rubinsteyn","doi":"10.1038/s42256-025-01096-6","DOIUrl":"10.1038/s42256-025-01096-6","url":null,"abstract":"Despite recent advances in T cell receptor (TCR) engineering, designing functional TCRs against arbitrary targets remains challenging due to complex rules governing cross-reactivity and limited paired data. Here we present TCR-TRANSLATE, a sequence-to-sequence framework that adapts low-resource machine translation techniques to generate antigen-specific TCR sequences against unseen epitopes. By evaluating 12 model variants of the BART and T5 model architectures, we identified key factors affecting performance and utility, revealing discordances between these objectives. Our flagship model, TCRT5, outperforms existing approaches on computational benchmarks, prioritizing functionally relevant sequences at higher ranks. Most significantly, we experimentally validated a computationally designed TCR against Wilms’ tumour antigen, a therapeutically relevant target in leukaemia, excluded from our training and validation sets. Although the identified TCR shows cross-reactivity with pathogen-derived peptides, highlighting limitations in specificity, our work represents the successful computational design of a functional TCR construct against a non-viral epitope from the target sequence alone. Our findings establish a foundation for computational TCR design and reveal current limitations in data availability and methodology, providing a framework for accelerating personalized immunotherapy by reducing the search space for novel targets. TCR-TRANSLATE, a deep learning framework adapting machine translation to immune design, demonstrates the successful generation of a functional T cell receptor sequence for a cancer epitope from the target sequence alone.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 9","pages":"1494-1509"},"PeriodicalIF":23.9,"publicationDate":"2025-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.nature.comhttps://www.nature.com/articles/s42256-025-01096-6.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145009024","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Siyang Jiang, Hao Yang, Qipeng Xie, Chuan Ma, Sen Wang, Zhe Liu, Tao Xiang, Guoliang Xing
{"title":"Towards compute-efficient Byzantine-robust federated learning with fully homomorphic encryption","authors":"Siyang Jiang, Hao Yang, Qipeng Xie, Chuan Ma, Sen Wang, Zhe Liu, Tao Xiang, Guoliang Xing","doi":"10.1038/s42256-025-01107-6","DOIUrl":"https://doi.org/10.1038/s42256-025-01107-6","url":null,"abstract":"<p>In highly regulated domains such as finance and healthcare, where stringent data-sharing constraints pose substantial obstacles, federated learning (FL) has emerged as a transformative paradigm in distributed machine learning, facilitating collaborative model training, preserving data decentralization and upholding governance standards. Despite its advantages, FL is vulnerable to poisoning attacks during central model aggregation, prompting the development of Byzantine-robust FL systems that use robust aggregation rules to counter malicious attacks. However, neural network models in such systems are susceptible to unintentionally memorizing and revealing individual training instances, thereby introducing substantial information leakage risks, as adversaries may exploit this vulnerability to reconstruct sensitive data through model outputs transmitted over the air. Existing solutions fall short of providing a viable Byzantine-robust FL system that is completely secure against information leakage and is computationally efficient. To address these concerns, we propose Lancelot, an efficient and effective Byzantine-robust FL framework that uses fully homomorphic encryption to safeguard against malicious client activities. Lancelot introduces a mask-based encrypted sorting mechanism that overcomes the limitations of multiplication depth in ciphertext sorting with zero information leakage. It incorporates cryptographic enhancements like lazy relinearization, dynamic hoisting and GPU acceleration to ensure practical computational efficiency. Extensive experiments demonstrate that Lancelot surpasses existing approaches, achieving a 20-fold enhancement in processing speed.</p>","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"16 1","pages":""},"PeriodicalIF":23.8,"publicationDate":"2025-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145009023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Haoran Sun, Liang He, Pan Deng, Guoqing Liu, Zhiyu Zhao, Yuliang Jiang, Chuan Cao, Fusong Ju, Lijun Wu, Haiguang Liu, Tao Qin, Tie-Yan Liu
{"title":"Accelerating protein engineering with fitness landscape modelling and reinforcement learning","authors":"Haoran Sun, Liang He, Pan Deng, Guoqing Liu, Zhiyu Zhao, Yuliang Jiang, Chuan Cao, Fusong Ju, Lijun Wu, Haiguang Liu, Tao Qin, Tie-Yan Liu","doi":"10.1038/s42256-025-01103-w","DOIUrl":"10.1038/s42256-025-01103-w","url":null,"abstract":"Protein engineering holds substantial promise for designing proteins with customized functions, yet the vast landscape of potential mutations versus limited laboratory capacity constrains the discovery of optimal sequences. Here, to address this, we present the μProtein framework, which accelerates protein engineering by combining μFormer, a deep learning model for accurate mutational effect prediction, with μSearch, a reinforcement learning algorithm designed to efficiently navigate the protein fitness landscape using μFormer as an oracle. μProtein leverages single-mutation data to predict optimal sequences with complex, multi-amino-acid mutations through its modelling of epistatic interactions and a multi-step search strategy. In addition to strong performance on benchmark datasets, μProtein identified high-gain-of-function multi-point mutants for the enzyme β-lactamase, surpassing one of the highest-known activity levels, in wet laboratory, trained solely on single-mutation data. These results demonstrate μProtein’s capability to discover impactful mutations across the vast protein sequence space, offering a robust, efficient approach for protein optimization. μProtein, combining deep learning and reinforcement learning, is developed to design high-function proteins. This framework, trained only on single-mutation data, discovers multi-site β-lactamase mutants with up to 2,000× growth rates.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 9","pages":"1446-1460"},"PeriodicalIF":23.9,"publicationDate":"2025-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145009025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Applying genomic AI to combat antibiotic resistance in low-income countries","authors":"Dickson Aruhomukama","doi":"10.1038/s42256-025-01108-5","DOIUrl":"10.1038/s42256-025-01108-5","url":null,"abstract":"","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 9","pages":"1369-1370"},"PeriodicalIF":23.9,"publicationDate":"2025-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144987427","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Janosh Riebesell, Rhys E. A. Goodall, Philipp Benner, Yuan Chiang, Bowen Deng, Gerbrand Ceder, Mark Asta, Alpha A. Lee, Anubhav Jain, Kristin A. Persson
{"title":"Author Correction: A framework to evaluate machine learning crystal stability predictions","authors":"Janosh Riebesell, Rhys E. A. Goodall, Philipp Benner, Yuan Chiang, Bowen Deng, Gerbrand Ceder, Mark Asta, Alpha A. Lee, Anubhav Jain, Kristin A. Persson","doi":"10.1038/s42256-025-01117-4","DOIUrl":"10.1038/s42256-025-01117-4","url":null,"abstract":"","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 9","pages":"1586-1586"},"PeriodicalIF":23.9,"publicationDate":"2025-09-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.nature.comhttps://www.nature.com/articles/s42256-025-01117-4.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145129500","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Johannes Y. Lee, Sangjoon Lee, Abhishek Mishra, Xu Yan, Brandon McMahan, Brent Gaisford, Charles Kobashigawa, Mike Qu, Chang Xie, Jonathan C. Kao
{"title":"Brain–computer interface control with artificial intelligence copilots","authors":"Johannes Y. Lee, Sangjoon Lee, Abhishek Mishra, Xu Yan, Brandon McMahan, Brent Gaisford, Charles Kobashigawa, Mike Qu, Chang Xie, Jonathan C. Kao","doi":"10.1038/s42256-025-01090-y","DOIUrl":"10.1038/s42256-025-01090-y","url":null,"abstract":"Motor brain–computer interfaces (BCIs) decode neural signals to help people with paralysis move and communicate. Even with important advances in the past two decades, BCIs face a key obstacle to clinical viability: BCI performance should strongly outweigh costs and risks. To significantly increase the BCI performance, we use shared autonomy, where artificial intelligence (AI) copilots collaborate with BCI users to achieve task goals. We demonstrate this AI-BCI in a non-invasive BCI system decoding electroencephalography signals. We first contribute a hybrid adaptive decoding approach using a convolutional neural network and ReFIT-like Kalman filter, enabling healthy users and a participant with paralysis to control computer cursors and robotic arms via decoded electroencephalography signals. We then design two AI copilots to aid BCI users in a cursor control task and a robotic arm pick-and-place task. We demonstrate AI-BCIs that enable a participant with paralysis to achieve 3.9-times-higher performance in target hit rate during cursor control and control a robotic arm to sequentially move random blocks to random locations, a task they could not do without an AI copilot. As AI copilots improve, BCIs designed with shared autonomy may achieve higher performance. AI copilots are integrated into brain–computer interfaces, enabling a paralysed participant to achieve improved control of computer cursors and robotic arms. This shared autonomy approach offers a promising path to increase BCI performance and clinical viability.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 9","pages":"1510-1523"},"PeriodicalIF":23.9,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144928057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Rigorous integration of single-cell ATAC-seq data using regularized barycentric mapping","authors":"Shuchen Zhu, Heyang Hua, Shengquan Chen","doi":"10.1038/s42256-025-01099-3","DOIUrl":"10.1038/s42256-025-01099-3","url":null,"abstract":"Single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) deciphers genome-wide chromatin accessibility, providing profound insights into gene regulation mechanisms. With the rapid advance of sequencing technologies, scATAC-seq data typically encompass numerous samples from various conditions, resulting in complex batch effects, thus necessitating reliable integration tools. While numerous batch integration tools exist for single-cell RNA sequencing data, inherent data characteristic differences limit their effectiveness on scATAC-seq data. Existing integration methods for scATAC-seq data suffer from several fundamental limitations, such as disrupting the biological heterogeneity and focusing solely on low-dimensional correction, which may distort data and hinder downstream analysis. Here we propose Fountain, a deep learning framework for scATAC-seq data integration via rigorous barycentric mapping. Barycentric mapping transforms one data distribution to another in a principled and effective manner through optimal transport. By regularizing barycentric mapping with geometric data information, Fountain achieves accurate batch alignment while preserving biological heterogeneity. Comprehensive experiments across diverse real-world datasets demonstrate the advantages of Fountain over existing methods in batch correction and biological conservation. In addition, the trained Fountain model can integrate data from new batches alongside already integrated data without retraining, enabling continuous online data integration. Moreover, Fountain’s reconstruction strategy generates batch-corrected ATAC profiles, improving the capture of cellular heterogeneity and revealing cell-type-specific implications such as expression enrichment analysis and partitioned heritability analysis. Zhu, Hua and Chen propose Fountain, a deep learning framework for batch integration of scATAC-seq data that utilizes regularized barycentric mapping. It preserves biological heterogeneity, enabling online and original dimensionality integration.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 9","pages":"1461-1477"},"PeriodicalIF":23.9,"publicationDate":"2025-08-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144900548","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LLMs as all-in-one tools to easily generate publication-ready citation diversity reports","authors":"Melissa S. Cantú, Michael R. King","doi":"10.1038/s42256-025-01101-y","DOIUrl":"10.1038/s42256-025-01101-y","url":null,"abstract":"","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 9","pages":"1371-1372"},"PeriodicalIF":23.9,"publicationDate":"2025-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144900551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reusability report: Exploring the transferability of self-supervised learning models from single-cell to spatial transcriptomics","authors":"Chuangyi Han, Senlin Lin, Zhikang Wang, Yan Cui, Qi Zou, Zhiyuan Yuan","doi":"10.1038/s42256-025-01097-5","DOIUrl":"10.1038/s42256-025-01097-5","url":null,"abstract":"Self-supervised learning (SSL) has emerged as a powerful approach for learning meaningful representations from large-scale unlabelled datasets in single-cell genomics. Richter et al. evaluated SSL pretext tasks on modelling single-cell RNA sequencing (scRNA-seq) data, demonstrating the effective use of SSL models. However, the transferability of these pretrained SSL models to the spatial transcriptomics domain remains unexplored. Here we assess the performance of three SSL models (random mask, gene programme mask and Barlow Twins) pretrained on scRNA-seq data with spatial transcriptomics datasets, focusing on cell-type prediction and spatial clustering. Our experiments demonstrate that the SSL model with random mask strategy exhibits the best overall performance among evaluated SSL models. Moreover, the models trained from scratch on spatial transcriptomics data outperform the fine-tuned SSL models on cell-type prediction, highlighting a domain gap between scRNA-seq and spatial transcriptomics data whose underlying causes remain an open question. Through expanded analyses of multiple imputation methods and data degradation scenarios, we demonstrate that gene imputation would degrade SSL model performance on cell-type prediction, an effect that is exacerbated by increasing data sparsity. Finally, integrating zero-shot random mask embeddings into chosen spatial clustering methods significantly enhanced their accuracy. Overall, our findings provide valuable insights into the limitations and potential of transferring SSL models to spatial transcriptomics and offer practical guidance for researchers leveraging pretrained models for spatial transcriptomics data analysis. Self-supervised learning models for single-cell RNA sequencing data exhibit poor transferability to spatial transcriptomics for cell-type prediction, although their learned features may enhance spatial analysis.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 9","pages":"1414-1428"},"PeriodicalIF":23.9,"publicationDate":"2025-08-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144900442","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards responsible geospatial foundation models","authors":"","doi":"10.1038/s42256-025-01106-7","DOIUrl":"10.1038/s42256-025-01106-7","url":null,"abstract":"Recent years have seen a surge in geospatial artificial intelligence models, with promising applications in ecological and environmental monitoring tasks. Further work should also focus on the sustainable development of such models.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"7 8","pages":"1189-1189"},"PeriodicalIF":23.9,"publicationDate":"2025-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.nature.comhttps://www.nature.com/articles/s42256-025-01106-7.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144900431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}