clonevdjseq: A workflow and bioinformatics management system for sequencing, archiving, and analysis of VDJ sequences from clonal libraries.

IF 3.3 · Zone 3 (Biology) · JCR Q2 (Biochemical Research Methods)

BMC Bioinformatics Pub Date : 2025-07-21 DOI:10.1186/s12859-025-06107-2

Keith Mitchell, Samuel Hunter, Lutz Froenicke, Karl Murray, Matthew Settles, James S Trimmer

Volume 26, issue 1, article 186. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12278597/pdf/

Citations: 0

Abstract


Background: Advances in next-generation sequencing technologies have facilitated extensive analysis of B cell and T cell receptor (BCR and TCR, respectively) sequences from monoclonal hybridoma libraries, single B cells, and single T cells, generating large amounts of data on antigen recognition. However, existing workflows and bioinformatics tools often lack the flexibility and scalability needed to handle large clonal-level datasets effectively. An initial, system- and hybridoma-dependent version of this code was distributed as part of the NeuroMabSeq publication; clonevdjseq is intended as a technical addendum to that work, with broader system compatibility and enhanced modeling.

Results: We present clonevdjseq, an integrated and accessible software solution built on Nextflow and Django. Developed primarily for large hybridoma libraries, the workflow is also amenable to BCR and TCR sequence analysis of homogeneous populations or clones of B and T cells, respectively. The pipeline includes modules for read processing, amplicon denoising, and quality control of paired variable light/heavy chains of BCRs from B cells and hybridomas, or of alpha (α)/beta (β) and delta (δ)/gamma (γ) chains of TCRs in T cell applications. It is built upon a robust, high-throughput library prep protocol, and its processed output has been verified across thousands of monoclonal antibodies. This effort has yielded sequences used to develop functional recombinant monoclonal antibodies and single-chain variable fragments as part of the NeuroMabSeq initiative, in which thousands of hybridoma samples were processed (Mitchell et al. in Sci Rep 13(1):16200, 2023); clonevdjseq adds further modeling and extensibility to other modalities. The software is run via Nextflow and offers a database and web app as an optional final processing step for disseminating results and exploring data.
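To make the idea of reducing a clonal amplicon library to a few candidate chain sequences concrete, the following is a minimal sketch in Python. It is not clonevdjseq's implementation: it swaps in plain abundance-based collapsing of identical reads as a simple stand-in for amplicon denoising, and the names `Candidate`, `dominant_sequences`, and `min_fraction` are hypothetical, chosen only for illustration.

```python
from collections import Counter
from typing import Iterable, List, NamedTuple


class Candidate(NamedTuple):
    """One collapsed sequence with its read support (hypothetical type)."""
    sequence: str
    count: int
    fraction: float


def dominant_sequences(reads: Iterable[str], min_fraction: float = 0.1) -> List[Candidate]:
    """Collapse identical reads and keep those above an abundance cutoff.

    Assumption: in a clonal sample, the true chain sequence dominates the
    reads, while sequencing errors and contaminants appear at low frequency.
    """
    # Normalize whitespace and case, skip empty lines, count identical reads.
    counts = Counter(r.strip().upper() for r in reads if r.strip())
    total = sum(counts.values())
    if total == 0:
        return []
    # Rank by descending count; break ties alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [
        Candidate(seq, n, n / total)
        for seq, n in ranked
        if n / total >= min_fraction
    ]


if __name__ == "__main__":
    # Toy reads for one hypothetical sample: one dominant sequence plus noise.
    toy_reads = ["ACGTAC"] * 8 + ["ACGTTC"] + ["GGGTAC"]
    for cand in dominant_sequences(toy_reads, min_fraction=0.2):
        print(cand.sequence, cand.count, f"{cand.fraction:.2f}")
```

A real denoiser would also model sequencing error rates and merge near-identical variants; the fixed fraction cutoff here is only the simplest possible filter for a sample expected to be monoclonal or oligoclonal.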

Conclusions: clonevdjseq offers a comprehensive and scalable solution for processing and analyzing large monoclonal and oligoclonal VDJ datasets. Its modular design, dynamic pipeline, and robust database integration facilitate efficient data management and analysis. The platform is publicly available and aims to support the research community by providing an accessible and flexible tool for archiving and disseminating BCR sequences from hybridomas, and it is applicable to other uses such as TCR sequences from single-cell T cell populations.


Source journal

BMC Bioinformatics (Biology: Biochemical Research Methods)

CiteScore: 5.70
Self-citation rate: 3.30%
Articles published: 506
Review time: 4.3 months

Journal description: BMC Bioinformatics is an open access, peer-reviewed journal that considers articles on all aspects of the development, testing and novel application of computational and statistical methods for the modeling and analysis of all kinds of biological data, as well as other areas of computational biology. BMC Bioinformatics is part of the BMC series which publishes subject-specific journals focused on the needs of individual research communities across all areas of biology and medicine. We offer an efficient, fair and friendly peer review service, and are committed to publishing all sound science, provided that there is some advance in knowledge presented by the work.