bioRxiv - Bioinformatics: Latest Papers

A novel channel invariant architecture for the segmentation of cells and nuclei in multiplexed images using InstanSeg
bioRxiv - Bioinformatics Pub Date : 2024-09-08 DOI: 10.1101/2024.09.04.611150
Thibaut Goldsborough, Alan O'Callaghan, Fiona Inglis, Leo Leplat, Andrew Filby, Hakan Bilen, Peter Bankhead
Abstract: The quantitative analysis of bioimaging data increasingly depends on the accurate segmentation of cells and nuclei, a significant challenge for the analysis of high-plex imaging data. Current deep learning-based approaches to segment cells in multiplexed images require reducing the input to a small and fixed number of input channels, discarding imaging information in the process. We present ChannelNet, a novel deep learning architecture for generating three-channel representations of multiplexed images irrespective of the number or ordering of imaged biomarkers. When combined with InstanSeg, ChannelNet sets a new benchmark for the segmentation of cells and nuclei on public multiplexed imaging datasets. We provide an open implementation of our method and integrate it in open source software. Our code and models are available at https://github.com/instanseg/instanseg.
Citations: 0
Projection Statistics ProST Online statistical assessment of group separation in data projection analysis
bioRxiv - Bioinformatics Pub Date : 2024-09-08 DOI: 10.1101/2024.09.04.611273
Danny Salem, Anuradha Surendra, Graeme SV McDowell, Miroslava Cuperlovic-Culf
Abstract:
Motivation: Unsupervised data projection for the determination of trends in the data, visualization of multidimensional data in a reduced dimension space or feature space reduction through combination of data is a major step in data mining. Methods such as Principal Component Analysis or t-distributed Stochastic Neighbor Embedding are regularly used as one of the first steps in computational biology or omics investigation. However, the significance of the separation of sample groups by these methods generally relies on visual assessment. User-friendly applications for different projection methods, each focusing on distinct data properties, are needed, as well as a rigorous method for statistical determination of the significance of separation of groups of interest in each dataset.
Results: We present Projection STatistics (ProST), a user-friendly solution for data projection analysis providing three unsupervised (PCA, t-SNE and UMAP) and one supervised (LDA) approach. For each method we are including a novel statistical investigation of the significance of group separation with Mann-Whitney U-rank or t-test analysis as well as necessary preprocessing steps. ProST provides an unbiased, objective application of the determination of the significance of the separation of measurement groups through either linear or manifold projection analysis with methods ranging from a focus on the separation of points based on major variances or on point proximity based on distance.
Availability: The ProST software application is freely available at https://complimet.ca/shiny/ProST/ with source code provided on https://github.com/complimet/prost.
Citations: 0
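The ProST abstract above describes testing group separation in a projection with a Mann-Whitney U test. The following is a minimal sketch of that idea, not the ProST implementation: it computes only the U statistic for two groups along a single projected coordinate. The function name `mann_whitney_u` and the toy coordinates are illustrative assumptions, not values from the paper.

```python
def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: count of (a, b) pairs with a > b, ties counted as 0.5."""
    u = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical first-component scores for two sample groups (made-up numbers).
pc1_group_a = [2.1, 1.8, 2.5]
pc1_group_b = [-0.3, 0.4, -1.1]

u = mann_whitney_u(pc1_group_a, pc1_group_b)
# U ranges from 0 to len(a) * len(b); values near either extreme indicate strong separation.
print(u, "of", len(pc1_group_a) * len(pc1_group_b))
```

In practice a p-value would come from a statistics library such as `scipy.stats.mannwhitneyu` rather than from this hand-rolled statistic.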
Predicting the translation efficiency of messenger RNA in mammalian cells
bioRxiv - Bioinformatics Pub Date : 2024-08-11 DOI: 10.1101/2024.08.11.607362
Dinghai Zheng, Jun Wang, Logan Persyn, Yue Liu, Fernando Ulloa Montoya, Can Cenik, Vikram Agarwal
Abstract: The degree to which translational control is specified by mRNA sequence is poorly understood in mammalian cells. Here, we constructed and leveraged a compendium of 3,819 ribosomal profiling datasets, distilling them into a transcriptome-wide atlas of translation efficiency (TE) measurements encompassing >140 human and mouse cell types. We subsequently developed RiboNN, a multitask deep convolutional neural network, and classic machine learning models to predict TEs in hundreds of cell types from sequence-encoded mRNA features, achieving state-of-the-art performance (r=0.79 in human and r=0.78 in mouse for mean TE across cell types). While the majority of earlier models solely considered 5′ UTR sequence, RiboNN integrates contributions from the full-length mRNA sequence, learning that the 5′ UTR, CDS, and 3′ UTR respectively possess ~67%, 31%, and 2% per-nucleotide information density in the specification of mammalian TEs. Interpretation of RiboNN revealed that the spatial positioning of low-level di- and tri-nucleotide features (i.e., including codons) largely explains model performance, capturing mechanistic principles such as how ribosomal processivity and tRNA abundance control translational output. RiboNN is predictive of the translational behavior of base-modified therapeutic RNA, and can explain evolutionary selection pressures in human 5′ UTRs. Finally, it detects a common language governing mRNA regulatory control and highlights the interconnectedness of mRNA translation, stability, and localization in mammalian organisms.
Citations: 0
Translation efficiency covariation across cell types is a conserved organizing principle of mammalian transcriptomes
bioRxiv - Bioinformatics Pub Date : 2024-08-11 DOI: 10.1101/2024.08.11.607360
Yue Liu, Ian Hoskins, Michael Geng, Qiuxia Zhao, Jonathan Chacko, Kangsheng Qi, Logan Persyn, Jun Wang, Dinghai Zheng, Yochen Zhong, Shilpa Rao, Dayea Park, Elif Sarinay Cenik, Vikram Agarwal, Hakan Ozadam, Can Cenik
Abstract: Characterization of shared patterns of RNA expression between genes across conditions has led to the discovery of regulatory networks and novel biological functions. However, it is unclear if such coordination extends to translation, a critical step in gene expression. Here, we uniformly analyzed 3,819 ribosome profiling datasets from 117 human and 94 mouse tissues and cell lines. We introduce the concept of Translation Efficiency Covariation (TEC), identifying coordinated translation patterns across cell types. We nominate potential mechanisms driving shared patterns of translation regulation. TEC is conserved across human and mouse cells and helps uncover gene functions. Moreover, our observations indicate that proteins that physically interact are highly enriched for positive covariation at both translational and transcriptional levels. Our findings establish translational covariation as a conserved organizing principle of mammalian transcriptomes.
Citations: 0
Network-based modelling reveals cell-type enriched patterns of non-coding RNA regulation during human skeletal muscle remodelling
bioRxiv - Bioinformatics Pub Date : 2024-08-11 DOI: 10.1101/2024.08.11.606848
Jonathan Cesare Mcleod, Changhyun Lim, Tanner Stokes, Jalil-Ahmad Sharif, Vagif Zeynalli, Lucas Wiens, Alysha C D'Souza, Lauren Colenso-Semple, James McKendry, Robert W Morton, Cameron J Mitchell, Sara Y Oikawa, Claes Wahlestedt, Paul Chapple, Chris McGlory, James A Timmons, Stuart M Phillips
Abstract: Most human genes are non-protein-coding RNA (ncRNA). A handful of ncRNAs have characterised functions, including important epigenetic roles in development and disease. Neither ncRNA nor multinucleated muscle is ideally suited to sequencing technologies. We therefore used customised RNA profiling methods and quantitative network modelling to study cell-type specific ncRNA transcriptome responses during load-induced skeletal muscle hypertrophy. We completed five independent supervised exercise-training studies (n=144) and 61% of individuals accrued muscle mass beyond normal technical variation (lean mass responders, LMR). The remainder were defined as having no measurable lean mass response (NMLMR). Fifty ncRNA genes (FDR < 1%) were differentially regulated in LMR, and in total we identified 110 ncRNAs for further study. A network model of the human muscle transcriptome was built (n=437 samples), assigning ncRNAs to protein coding modules representing functional pathways or single-cell types. We identified that the known hypertrophy-related ncRNA, CYTOR, was leukocyte-associated in vivo in humans (FDR = 4.9 × 10⁻⁷; Fold Enrichment [FE] = 6.6). Other ncRNA modules included PPP1CB-DT, which was segregated with myofibril assembly genes (FDR = 8.15 × 10⁻⁸; FE = 47.5), while EEF1A1P24 and TMSB4XP8 were associated with vascular remodelling and angiogenesis genes (FDR = 2.77 × 10⁻⁵; FE = 3.6). MYREM was positively associated with hypertrophy, and we established its myonuclear expression pattern in vivo in humans using spatial transcriptomics probes. We show that single-cell type associations of ncRNA are identifiable from bulk transcriptomic data and that hypertrophy-linked ncRNA genes appear to mediate their association with muscle growth via multiple cell types.
Citations: 0
Efficient clustering of large molecular libraries
bioRxiv - Bioinformatics Pub Date : 2024-08-10 DOI: 10.1101/2024.08.10.607459
Vicky Jung, Kenneth Lopez Perez, Lexin Chen, Kate Huddleston, Ramon Alain Miranda Quintana
Abstract: The widespread use of Machine Learning (ML) techniques in chemical applications has come with the pressing need to analyze extremely large molecular libraries. In particular, clustering remains one of the most common tools to dissect the chemical space. Unfortunately, most current approaches present unfavorable time and memory scaling, which makes them unsuitable to handle million- and billion-sized sets. Here, we propose to bypass these problems with a time- and memory-efficient clustering algorithm, BitBIRCH. This method uses a tree structure similar to the one found in the Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH) algorithm to ensure O(N) time scaling. BitBIRCH leverages the instant similarity (iSIM) formalism to process binary fingerprints, allowing the use of Tanimoto similarity, and reducing memory requirements. Our tests show that BitBIRCH is already >1,000 times faster than standard implementations of the Taylor-Butina clustering for libraries with 1,500,000 molecules. BitBIRCH increases efficiency without compromising the quality of the resulting clusters. We explore strategies to handle large sets, which we applied in the clustering of one billion molecules under 5 hours using a parallel/iterative BitBIRCH approximation.
Citations: 0
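The BitBIRCH abstract above relies on Tanimoto similarity between binary fingerprints. As a minimal sketch of that similarity measure only (it is not the authors' code and does not implement the iSIM formalism or the BIRCH tree), pairwise Tanimoto similarity can be computed on fingerprints stored as Python integers. The function name `tanimoto` and the toy 8-bit fingerprints are illustrative assumptions.

```python
def tanimoto(fp_a: int, fp_b: int) -> float:
    """Tanimoto similarity of two binary fingerprints stored as integers."""
    common = bin(fp_a & fp_b).count("1")  # bits set in both fingerprints
    total = bin(fp_a | fp_b).count("1")   # bits set in at least one fingerprint
    # Convention assumed here: two all-zero fingerprints are treated as identical.
    return 1.0 if total == 0 else common / total


# Toy 8-bit fingerprints (made up for illustration).
fp1 = 0b10110010
fp2 = 0b10010110

print(tanimoto(fp1, fp2))  # 3 shared bits out of 5 set in either
```

Comparing every pair this way scales quadratically with library size, which is the cost that tree-based approaches such as the one described in the abstract aim to avoid.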
Somatic mutation phasing and haplotype extension using linked-reads in multiple myeloma
bioRxiv - Bioinformatics Pub Date : 2024-08-10 DOI: 10.1101/2024.08.09.607342
Steven M Foltz, Yize Li, Lijun Yao, Nadezhda V Terekhanova, Amila Weerasinghe, Qingsong Gao, Guanlan Dong, Moses Schindler, Song Cao, Hua Sun, Reyka G Jayasinghe, Robert S Fulton, Catrina C Fronick, Justin King, Daniel R Kohnen, Mark A Fiala, Ken Chen, John F DiPersio, Ravi Vij, Li Ding
Abstract: Somatic mutation phasing informs our understanding of cancer-related events, like driver mutations. We generated linked-read whole genome sequencing data for 23 samples across disease stages from 14 multiple myeloma (MM) patients and systematically assigned somatic mutations to haplotypes using linked-reads. Here, we report the reconstructed cancer haplotypes and phase blocks from several MM samples and show how phase block length can be extended by integrating samples from the same individual. We also uncover phasing information in genes frequently mutated in MM, including DIS3, HIST1H1E, KRAS, NRAS, and TP53, phasing 79.4% of 20,705 high-confidence somatic mutations. In some cases, this enabled us to interpret clonal evolution models at higher resolution using pairs of phased somatic mutations. For example, our analysis of one patient suggested that two NRAS hotspot mutations occurred on the same haplotype but were independent events in different subclones. Given sufficient tumor purity and data quality, our framework illustrates how haplotype-aware analysis of somatic mutations in cancer can be beneficial for some cancer cases.
Citations: 0
Unveiling Fine-scale Spatial Structures and Amplifying Gene Expression Signals in Ultra-Large ST slices with HERGAST
bioRxiv - Bioinformatics Pub Date : 2024-08-10 DOI: 10.1101/2024.08.09.607422
Yuqiao Gong, Xin Yuan, Qiong Jiao, Zhangsheng Yu
Abstract: We propose HERGAST, a system for spatial structure identification and signal amplification in ultra-large-scale and ultra-high-resolution spatial transcriptomics (ST) data. To handle ultra-large ST data, we adopt a divide-and-conquer strategy and devise a Divide-Iterate-Conquer framework specifically for spatial transcriptomics data analysis, which can also be adopted by other computational methods to extend them to ultra-large-scale ST data analysis. To tackle the potential oversmoothing problem arising from data splitting, we construct a heterogeneous graph network to incorporate both local and global spatial relationships. In simulation, HERGAST consistently outperformed other methods across all settings, with an average gain of more than 10%. In real-world data, HERGAST's high-precision spatial clustering enabled finding SPP1+ macrophages intermingled in tumors in colorectal cancer, while the enhanced gene expression signal enabled discovering unique spatial expression patterns of key genes in breast cancer.
Citations: 0
AllerTrans: An Improved Protein Allergenicity Prediction Model Using Deep Learning
bioRxiv - Bioinformatics Pub Date : 2024-08-10 DOI: 10.1101/2024.08.09.607419
Faezeh Sarlakifar, Hamed Malek, Najaf Allahyari Fard, Zahra Khotanlou
Abstract: Recognizing the potential allergenicity of proteins is essential for ensuring their safety. Allergens are a major concern in determining protein safety, especially with the increasing use of recombinant proteins in new medical products. These proteins need careful allergenicity assessment to guarantee their safety. However, traditional laboratory testing for allergenicity is expensive and time-consuming. To address this challenge, bioinformatics offers efficient and cost-effective alternatives for predicting protein allergenicity. In this study, we developed an enhanced deep-learning model to predict the potential allergenicity of proteins based on their primary structure represented as protein sequences. Our approach utilizes two protein language models to extract distinct feature vectors for each sequence, which are then input into a deep neural network model for classification. Each feature vector represents a specific aspect of the protein sequence, and combining them enhances the final result and balances the model's sensitivity and specificity. The model classifies proteins into allergenic or non-allergenic classes. Our proposed model demonstrates admissible improvement across all evaluation metrics compared to the AlgPred 2.0 model, achieving a sensitivity of 97.91%, specificity of 97.69%, accuracy of 97.80%, and an impressive area under the ROC curve of 99% on the AlgPred 2.0 dataset using standard five-fold cross-validation.
Citations: 0
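The AllerTrans abstract above reports sensitivity, specificity, and accuracy for a binary allergen classifier. As a brief sketch of how those three metrics follow from confusion-matrix counts (standard definitions, not the authors' evaluation code), the function name `classification_metrics` and the counts below are illustrative assumptions and are not the paper's data.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                # fraction of true allergens detected
    specificity = tn / (tn + fp)                # fraction of non-allergens correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # fraction of all predictions that are correct
    return sensitivity, specificity, accuracy


# Made-up counts for illustration only.
sens, spec, acc = classification_metrics(tp=90, fn=10, tn=80, fp=20)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

When the two classes are the same size, accuracy is the mean of sensitivity and specificity, which matches the pattern of the figures quoted in the abstract.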
Comprehensive assembly of monoclonal and mixed antibody sequences
bioRxiv - Bioinformatics Pub Date : 2024-08-10 DOI: 10.1101/2024.08.09.607415
Wenbin Jiang, Yueting Xiong, Jin Xiao, Jingyi Wang, Zhenjian Jiang, Ling Luo, Quan Yuan, Ningshao Xia, Rongshan Yu
Abstract: The elucidation of antibody sequence information is crucial for understanding antigen binding and advancing therapeutic and research applications. However, complete de novo assembly of monoclonal antibody sequences remains challenging due to accuracy and robustness limitations. To address this issue, we introduce Fusion, an innovative de novo assembler that integrates overlapping peptides and template information into complete sequences using a beam search strategy. We demonstrate Fusion's performance by reconstructing multiple human and murine antibodies with the highest accuracy (100% and over 99%, respectively). Biological validation of the recombinantly expressed AFS98 antibody with unknown sequences further supports its effectiveness. Furthermore, current methods are applicable only to traditional monoclonal antibody sequencing assembly, presenting a significant bottleneck in achieving higher throughput. In contrast, Fusion can assemble peptide sequences from mixtures of two or three monoclonal antibodies into complete individual sequences with the same accuracy as traditional sequencing, significantly enhancing throughput. To our knowledge, this is the first study enabling high-throughput sequencing of multiple antibodies using only bottom-up mass spectrometry. The duration, expense, and reagent consumption of mass spectrometry detection are comparable to those required for sequencing a single monoclonal antibody. In summary, Fusion's superior performance in handling complex antibody sequencing represents a significant advancement in antibody research.
Citations: 0