The Observed T Cell Receptor Space database enables paired-chain repertoire mining, coherence analysis, and language modeling.
Matthew I J Raybould, Alexander Greenshields-Watson, Parth Agarwal, Broncio Aguilar-Sanjuan, Tobias H Olsen, Oliver M Turnbull, Nele P Quast, Charlotte M Deane
Cell Reports, published 2024-08-29. DOI: 10.1016/j.celrep.2024.114704 (https://doi.org/10.1016/j.celrep.2024.114704)
Abstract
T cell activation is governed through T cell receptors (TCRs), heterodimers of two sequence-variable chains (often an α and β chain) that synergistically recognize antigen fragments presented on cell surfaces. Despite this, existing repositories collect only single-chain, not paired-chain, TCR sequence data. We addressed this gap by creating the Observed TCR Space (OTS) database, a source of consistently processed and annotated, full-length, paired-chain TCR sequences. Currently, OTS contains 5.35 million redundant (1.63 million non-redundant), predominantly human sequences from across 50 studies and at least 75 individuals. Using OTS, we identify pairing biases, public TCRs, and distinct chain coherence patterns relative to antibodies. We also release a paired-chain TCR language model, providing paired embedding representations and a method for residue in-filling conditional on the partner chain. OTS will be updated as a central community resource and is freely downloadable and available as a web application.
About the journal:
Cell Reports publishes high-quality research across the life sciences and focuses on new biological insight as its primary criterion for publication. The journal offers three primary article types: Reports, which are shorter single-point articles; Research Articles, which are longer and provide deeper mechanistic insights; and Resources, which highlight significant technical advances or major informational datasets that contribute to biological advances. Reviews covering recent literature in emerging and active fields are also accepted.
The Cell Reports Portfolio includes gold open-access journals that cover life, medical, and physical sciences, and its mission is to make cutting-edge research and methodologies available to a wide readership.
The journal's professional in-house editors work closely with authors, reviewers, and the scientific advisory board, which consists of current and future leaders in their respective fields. The advisory board guides the scope, content, and quality of the journal, but editorial decisions are independently made by the in-house scientific editors of Cell Reports.