The Observed T Cell Receptor Space database enables paired-chain repertoire mining, coherence analysis, and language modeling.

IF 7.5, Zone 1 (Biology), JCR Q1 (CELL BIOLOGY)
Matthew I J Raybould, Alexander Greenshields-Watson, Parth Agarwal, Broncio Aguilar-Sanjuan, Tobias H Olsen, Oliver M Turnbull, Nele P Quast, Charlotte M Deane
Journal: Cell Reports
Publication date: 2024-08-29
Publication type: Journal Article
DOI: 10.1016/j.celrep.2024.114704 (https://doi.org/10.1016/j.celrep.2024.114704)

Abstract

T cell activation is governed through T cell receptors (TCRs), heterodimers of two sequence-variable chains (often an α and β chain) that synergistically recognize antigen fragments presented on cell surfaces. Despite this, existing repositories collect only single-chain, not paired-chain, TCR sequence data. We addressed this gap by creating the Observed TCR Space (OTS) database, a source of consistently processed and annotated, full-length, paired-chain TCR sequences. Currently, OTS contains 5.35 million redundant (1.63 million non-redundant), predominantly human sequences from 50 studies and at least 75 individuals. Using OTS, we identify pairing biases, public TCRs, and distinct chain coherence patterns relative to antibodies. We also release a paired-chain TCR language model, providing paired embedding representations and a method for residue in-filling conditional on the partner chain. OTS will be updated as a central community resource and is freely downloadable and available as a web application.

Source journal: Cell Reports (CELL BIOLOGY)
CiteScore: 13.80
Self-citation rate: 1.10%
Articles published: 1305
Review time: 77 days
Journal description: Cell Reports publishes high-quality research across the life sciences and focuses on new biological insight as its primary criterion for publication. The journal offers three primary article types: Reports, which are shorter single-point articles; Research Articles, which are longer and provide deeper mechanistic insights; and Resources, which highlight significant technical advances or major informational datasets that contribute to biological advances. Reviews covering recent literature in emerging and active fields are also accepted. The Cell Reports Portfolio includes gold open-access journals that cover life, medical, and physical sciences, and its mission is to make cutting-edge research and methodologies available to a wide readership. The journal's professional in-house editors work closely with authors, reviewers, and the scientific advisory board, which consists of current and future leaders in their respective fields. The advisory board guides the scope, content, and quality of the journal, but editorial decisions are made independently by the in-house scientific editors of Cell Reports.