{"title":"A Handle on Mass Coincidence Errors in <i>De Novo</i> Sequencing of Antibodies by Bottom-up Proteomics.","authors":"Douwe Schulte, Joost Snijder","doi":"10.1021/acs.jproteome.4c00188","DOIUrl":null,"url":null,"abstract":"<p><p>Antibody sequences can be determined at 99% accuracy directly from the polypeptide product by using bottom-up proteomics techniques. Sequencing accuracy at the peptide level is limited by the isobaric residues leucine and isoleucine, incomplete fragmentation spectra in which the order of two or more residues remains ambiguous due to lacking fragment ions for the intermediate positions, and isobaric combinations of amino acids, of potentially different lengths, for example, GG = N and GA = Q. Here, we present several updates to Stitch (v1.5), which performs template-based assembly of <i>de novo</i> peptides to reconstruct antibody sequences. This version introduces a mass-based alignment algorithm that explicitly accounts for mass coincidence errors. In addition, it incorporates a postprocessing procedure to assign I/L residues based on secondary fragments (satellite ions, <i>i.e.</i>, <i>w-</i>ions). Moreover, evidence for sequence assignments can now be directly evaluated with the addition of an integrated spectrum viewer. Lastly, input data from a wider selection of <i>de novo</i> peptide sequencing algorithms are allowed, now including Casanovo, PEAKS, Novor.Cloud, pNovo, and MaxNovo, in addition to flat text and FASTA. Combined, these changes make Stitch compatible with a larger range of data processing pipelines and improve its tolerance to peptide-level sequencing errors.</p>","PeriodicalId":48,"journal":{"name":"Journal of Proteome Research","volume":null,"pages":null},"PeriodicalIF":3.8000,"publicationDate":"2024-08-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11301774/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Proteome Research","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1021/acs.jproteome.4c00188","RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/6/27 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}
引用次数: 0
Abstract
Antibody sequences can be determined at 99% accuracy directly from the polypeptide product by using bottom-up proteomics techniques. Sequencing accuracy at the peptide level is limited by the isobaric residues leucine and isoleucine, incomplete fragmentation spectra in which the order of two or more residues remains ambiguous due to lacking fragment ions for the intermediate positions, and isobaric combinations of amino acids, of potentially different lengths, for example, GG = N and GA = Q. Here, we present several updates to Stitch (v1.5), which performs template-based assembly of de novo peptides to reconstruct antibody sequences. This version introduces a mass-based alignment algorithm that explicitly accounts for mass coincidence errors. In addition, it incorporates a postprocessing procedure to assign I/L residues based on secondary fragments (satellite ions, i.e., w-ions). Moreover, evidence for sequence assignments can now be directly evaluated with the addition of an integrated spectrum viewer. Lastly, input data from a wider selection of de novo peptide sequencing algorithms are allowed, now including Casanovo, PEAKS, Novor.Cloud, pNovo, and MaxNovo, in addition to flat text and FASTA. Combined, these changes make Stitch compatible with a larger range of data processing pipelines and improve its tolerance to peptide-level sequencing errors.
利用自下而上的蛋白质组学技术,可直接从多肽产物中确定抗体序列,准确率高达 99%。肽水平的测序准确性受到以下因素的限制:等位残基亮氨酸和异亮氨酸;不完整的片段谱,其中两个或更多残基的顺序因中间位置缺乏片段离子而模糊不清;氨基酸的等位组合,其长度可能不同,例如 GG = N 和 GA = Q。该版本引入了基于质量的比对算法,明确考虑了质量重合误差。此外,它还加入了一个后处理程序,根据二级碎片(卫星离子,即 w 离子)分配 I/L 残基。此外,现在还增加了综合光谱查看器,可以直接评估序列分配的证据。最后,除了平面文本和 FASTA 之外,现在还允许从更广泛的肽测序算法中输入数据,包括 Casanovo、PEAKS、Novor.Cloud、pNovo 和 MaxNovo。这些变化使 Stitch 能够兼容更多的数据处理管道,并提高了对肽段测序错误的容错能力。
期刊介绍:
Journal of Proteome Research publishes content encompassing all aspects of global protein analysis and function, including the dynamic aspects of genomics, spatio-temporal proteomics, metabonomics and metabolomics, clinical and agricultural proteomics, as well as advances in methodology including bioinformatics. The theme and emphasis is on a multidisciplinary approach to the life sciences through the synergy between the different types of "omics".