{"title":"Clinical large language models with misplaced focus","authors":"Zining Luo, Haowei Ma, Zhiwu Li, Yuquan Chen, Yixin Sun, Aimin Hu, Jiang Yu, Yang Qiao, Junxian Gu, Hongying Li, Xuxi Peng, Dunrui Wang, Ying Liu, Zhenglong Liu, Jiebin Xie, Zhen Jiang, Gang Tian","doi":"10.1038/s42256-024-00929-0","DOIUrl":"https://doi.org/10.1038/s42256-024-00929-0","url":null,"abstract":"<p>On 12 September 2024, OpenAI released two new large language models (LLMs) — o1-preview and o1-mini — marking an important shift in the competitive landscape of commercial LLMs, particularly concerning their reasoning capabilities. Since the introduction of GPT-3.5, OpenAI has launched 31 LLMs in two years. Researchers are rapidly applying these evolving commercial models in clinical medicine, achieving results that sometimes exceed human performance in specific tasks. Although such success is encouraging, the development of the models used for these tasks may not align with the characteristics and needs of clinical practice.</p><p>LLMs can be categorized as either open-source or closed-source. Open-source models, such as Meta’s Llama, allow developers to access source code, training data and documentation freely. By contrast, closed-source models are accessed only through official channels or application programming interfaces (APIs). Initially, open-source models dominated the LLM landscape, until the release of OpenAI’s GPT-3 in 2020<sup>1</sup>, which attracted considerable commercial interest and shifted focus towards closed-source approaches<sup>2</sup>.</p>","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"18 1","pages":""},"PeriodicalIF":23.8,"publicationDate":"2024-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142670260","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient generation of protein pockets with PocketGen","authors":"Zaixi Zhang, Wan Xiang Shen, Qi Liu, Marinka Zitnik","doi":"10.1038/s42256-024-00920-9","DOIUrl":"https://doi.org/10.1038/s42256-024-00920-9","url":null,"abstract":"<p>Designing protein-binding proteins is critical for drug discovery. However, artificial-intelligence-based design of such proteins is challenging due to the complexity of protein–ligand interactions, the flexibility of ligand molecules and amino acid side chains, and sequence–structure dependencies. We introduce PocketGen, a deep generative model that produces residue sequence and atomic structure of the protein regions in which ligand interactions occur. PocketGen promotes consistency between protein sequence and structure by using a graph transformer for structural encoding and a sequence refinement module based on a protein language model. The graph transformer captures interactions at multiple scales, including atom, residue and ligand levels. For sequence refinement, PocketGen integrates a structural adapter into the protein language model, ensuring that structure-based predictions align with sequence-based predictions. PocketGen can generate high-fidelity protein pockets with enhanced binding affinity and structural validity. It operates ten times faster than physics-based methods and achieves a 97% success rate, defined as the percentage of generated pockets with higher binding affinity than reference pockets. Additionally, it attains an amino acid recovery rate exceeding 63%.</p>","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"21 1","pages":""},"PeriodicalIF":23.8,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142637798","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AI reality check","authors":"","doi":"10.1038/s42256-023-00755-w","DOIUrl":"10.1038/s42256-023-00755-w","url":null,"abstract":"AI-generated media are on the rise and are here to stay. Regulation is urgently needed, but in the meantime creators, users and content distributors need to pursue various ways, and adopt various tools, for responsible generation, sharing and detection of AI-generated content.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"5 10","pages":"1055-1055"},"PeriodicalIF":23.8,"publicationDate":"2023-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.nature.com/articles/s42256-023-00755-w.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49697371","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Construction of a 3D whole organism spatial atlas by joint modelling of multiple slices with deep neural networks","authors":"Gefei Wang, Jia Zhao, Yan Yan, Yang Wang, Angela Ruohao Wu, Can Yang","doi":"10.1038/s42256-023-00734-1","DOIUrl":"10.1038/s42256-023-00734-1","url":null,"abstract":"Spatial transcriptomics (ST) technologies are revolutionizing the way to explore the spatial architecture of tissues. Currently, ST data analysis is often restricted to a single two-dimensional (2D) tissue slice, limiting our capacity to understand biological processes that take place in 3D space. Here we present STitch3D, a unified framework that integrates multiple ST slices to reconstruct 3D cellular structures. By jointly modelling multiple slices and integrating them with single-cell RNA-sequencing data, STitch3D simultaneously identifies 3D spatial regions with coherent gene-expression levels and reveals 3D cell-type distributions. STitch3D distinguishes biological variation among slices from batch effects, and effectively borrows information across slices to assemble powerful 3D models. Through comprehensive experiments, we demonstrate STitch3D’s performance in building comprehensive 3D architectures, which allow 3D analysis in the entire tissue region or even the whole organism. The outputs of STitch3D can be used for multiple downstream tasks, enabling a comprehensive understanding of biological systems. Computational methods for analysing single 2D tissue slices from spatial transcriptomics studies are well established, but their extension to the 3D domain is challenging. Wang et al. develop a deep learning framework that can perform 3D reconstruction of cellular structures in tissues as well as whole organisms.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"5 11","pages":"1200-1213"},"PeriodicalIF":23.8,"publicationDate":"2023-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49697374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep domain adversarial neural network for the deconvolution of cell type mixtures in tissue proteome profiling","authors":"Fang Wang, Fan Yang, Longkai Huang, Wei Li, Jiangning Song, Robin B. Gasser, Ruedi Aebersold, Guohua Wang, Jianhua Yao","doi":"10.1038/s42256-023-00737-y","DOIUrl":"10.1038/s42256-023-00737-y","url":null,"abstract":"Cell type deconvolution is a computational method for the determination/resolution of cell type proportions from bulk sequencing data, and is frequently used for the analysis of divergent cell types in tumour tissue samples. However, deconvolution technology is still in its infancy for the analysis of cell types using proteomic data due to challenges with repeatability/reproducibility, variable reference standards and the lack of single-cell proteomic reference data. Here we develop a deep-learning-based deconvolution method (scpDeconv) specifically designed for proteomic data. scpDeconv uses an autoencoder to leverage the information from bulk proteomic data to improve the quality of single-cell proteomic data, and employs a domain adversarial architecture to bridge the single-cell and bulk data distributions and transfer labels from single-cell data to bulk data. Extensive experiments validate the performance of scpDeconv in the deconvolution of proteomic data produced from various species/sources and different proteomic technologies. This method should find broad applicability to areas including tumour microenvironment interpretation and clinical diagnosis/classification. Deconvolution of cell types in tissue proteomic data is a challenging computational task for the bioinformatics community. A deep-learning method termed scpDeconv is introduced that makes efficient use of single-cell proteomics data to deconvolve cell types and states from bulk proteomics measurements.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"5 11","pages":"1236-1249"},"PeriodicalIF":23.8,"publicationDate":"2023-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49697409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Protein–protein contact prediction by geometric triangle-aware protein language models","authors":"Peicong Lin, Huanyu Tao, Hao Li, Sheng-You Huang","doi":"10.1038/s42256-023-00741-2","DOIUrl":"10.1038/s42256-023-00741-2","url":null,"abstract":"Information regarding the residue–residue distance between interacting proteins is important for modelling the structures of protein complexes, as well as being valuable for understanding the molecular mechanism of protein–protein interactions. With the advent of deep learning, many methods have been developed to accurately predict the intra-protein residue–residue contacts of monomers. However, it is still challenging to accurately predict inter-protein residue–residue contacts for protein complexes, especially hetero-protein complexes. Here we develop a protein language model-based deep learning method to predict the inter-protein residue–residue contacts of protein complexes—named DeepInter—by introducing a triangle-aware mechanism of triangle update and triangle self-attention into the deep neural network. We extensively validate DeepInter on diverse test sets of 300 homodimeric, 28 CASP-CAPRI homodimeric and 99 heterodimeric complexes and compare it with state-of-the-art methods including CDPred, DeepHomo2.0, GLINTER and DeepHomo. The results demonstrate the accuracy and robustness of DeepInter. Contact prediction between two proteins is still computationally challenging, but is vital for understanding multi-protein complexes. Lin et al. use a geometric deep learning approach to provide accurate predictions of inter-protein residue–residue contacts.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"5 11","pages":"1275-1284"},"PeriodicalIF":23.8,"publicationDate":"2023-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49697375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mitigating the missing-fragmentation problem in de novo peptide sequencing with a two-stage graph-based deep learning model","authors":"Zeping Mao, Ruixue Zhang, Lei Xin, Ming Li","doi":"10.1038/s42256-023-00738-x","DOIUrl":"10.1038/s42256-023-00738-x","url":null,"abstract":"Novel protein discovery and immunopeptidomics depend on highly sensitive de novo peptide sequencing with tandem mass spectrometry. Despite notable improvement using deep learning models, the missing-fragmentation problem remains an important hurdle that severely degrades the performance of de novo peptide sequencing. Here we reveal that in the process of peptide prediction, missing fragmentation results in the generation of incorrect amino acids within those regions and causes error accumulation thereafter. To tackle this problem, we propose GraphNovo, a two-stage de novo peptide-sequencing algorithm based on a graph neural network. GraphNovo focuses on finding the optimal path in the first stage to guide the sequence prediction in the second stage. Our experiments demonstrate that GraphNovo mitigates the effects of missing fragmentation and outperforms the state-of-the-art de novo peptide-sequencing algorithms. Identifying unknown peptides in tandem mass spectrometry is challenging as fragmentation of precursor peptides can be incomplete. Mao and colleagues present a method based on graph neural networks and a path-searching model to create more stable sequence predictions.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"5 11","pages":"1250-1260"},"PeriodicalIF":23.8,"publicationDate":"2023-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49697373","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A taxonomy and review of generalization research in NLP","authors":"Dieuwke Hupkes, Mario Giulianelli, Verna Dankers, Mikel Artetxe, Yanai Elazar, Tiago Pimentel, Christos Christodoulopoulos, Karim Lasri, Naomi Saphra, Arabella Sinclair, Dennis Ulmer, Florian Schottmann, Khuyagbaatar Batsuren, Kaiser Sun, Koustuv Sinha, Leila Khalatbari, Maria Ryskina, Rita Frieske, Ryan Cotterell, Zhijing Jin","doi":"10.1038/s42256-023-00729-y","DOIUrl":"10.1038/s42256-023-00729-y","url":null,"abstract":"The ability to generalize well is one of the primary desiderata for models of natural language processing (NLP), but what ‘good generalization’ entails and how it should be evaluated is not well understood. In this Analysis we present a taxonomy for characterizing and understanding generalization research in NLP. The proposed taxonomy is based on an extensive literature review and contains five axes along which generalization studies can differ: their main motivation, the type of generalization they aim to solve, the type of data shift they consider, the source by which this data shift originated, and the locus of the shift within the NLP modelling pipeline. We use our taxonomy to classify over 700 experiments, and we use the results to present an in-depth analysis that maps out the current state of generalization research in NLP and make recommendations for which areas deserve attention in the future. With the rapid development of natural language processing (NLP) models in the last decade came the realization that high performance levels on test sets do not imply that a model robustly generalizes to a wide range of scenarios. Hupkes et al. review generalization approaches in the NLP literature and propose a taxonomy based on five axes to analyse such studies: motivation, type of generalization, type of data shift, the source of this data shift, and the locus of the shift within the modelling pipeline.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"5 10","pages":"1161-1174"},"PeriodicalIF":23.8,"publicationDate":"2023-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.nature.com/articles/s42256-023-00729-y.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49697370","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving Wikipedia verifiability with AI","authors":"Fabio Petroni, Samuel Broscheit, Aleksandra Piktus, Patrick Lewis, Gautier Izacard, Lucas Hosseini, Jane Dwivedi-Yu, Maria Lomeli, Timo Schick, Michele Bevilacqua, Pierre-Emmanuel Mazaré, Armand Joulin, Edouard Grave, Sebastian Riedel","doi":"10.1038/s42256-023-00726-1","DOIUrl":"10.1038/s42256-023-00726-1","url":null,"abstract":"Verifiability is a core content policy of Wikipedia: claims need to be backed by citations. Maintaining and improving the quality of Wikipedia references is an important challenge and there is a pressing need for better tools to assist humans in this effort. We show that the process of improving references can be tackled with the help of artificial intelligence (AI) powered by an information retrieval system and a language model. This neural-network-based system, which we call SIDE, can identify Wikipedia citations that are unlikely to support their claims, and subsequently recommend better ones from the web. We train this model on existing Wikipedia references, therefore learning from the contributions and combined wisdom of thousands of Wikipedia editors. Using crowdsourcing, we observe that for the top 10% most likely citations to be tagged as unverifiable by our system, humans prefer our system’s suggested alternatives compared with the originally cited reference 70% of the time. To validate the applicability of our system, we built a demo to engage with the English-speaking Wikipedia community and find that SIDE’s first citation recommendation is preferred twice as often as the existing Wikipedia citation for the same top 10% most likely unverifiable claims according to SIDE. Our results indicate that an AI-based system could be used, in tandem with humans, to improve the verifiability of Wikipedia. The immense amount of Wikipedia articles makes it challenging for volunteers to ensure that cited sources support the claim they are attached to. Petroni et al. use an information-retrieval model to assist Wikipedia users in improving verifiability.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"5 10","pages":"1142-1148"},"PeriodicalIF":23.8,"publicationDate":"2023-10-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.nature.com/articles/s42256-023-00726-1.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49697372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Listening in to perceived speech with contrastive learning","authors":"Sergey D. Stavisky, Maitreyee Wairagkar","doi":"10.1038/s42256-023-00742-1","DOIUrl":"10.1038/s42256-023-00742-1","url":null,"abstract":"New algorithms allow researchers to decode words the brain is hearing with a non-invasive method, outside the scalp.","PeriodicalId":48533,"journal":{"name":"Nature Machine Intelligence","volume":"5 11","pages":"1179-1180"},"PeriodicalIF":23.8,"publicationDate":"2023-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49697410","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}