第五章。蛋白质推断与分组

A. Jones
{"title":"第五章。蛋白质推断与分组","authors":"A. Jones","doi":"10.1039/9781782626732-00093","DOIUrl":null,"url":null,"abstract":"A key process in many proteomics workflows is the identification of proteins, following analysis of tandem MS (MS/MS) spectra, for example by a database search. The core unit of identification from a database search is the identification of peptides, yet most researchers wish to know which proteins have been confidently identified in their samples. As such, following peptide identification, a second stage of data analysis is performed, either internally in the search engine or in a second package, called protein inference. Protein inference is challenging in the common case that proteins have been digested into peptides early in the proteomics workflow, and thus there is no direct link between a peptide and its parent protein. Many peptides could theoretically have been derived from more than one protein in the database searched, and thus it is not straightforward to determine which is the correct assignment. A variety of algorithms and implementations have been developed, which are reviewed in this chapter. Most approaches now report “protein groups” as a the core unit of identification from protein inference, since it is common for more than one database protein to share the same-set of evidence, and thus be indistinguishable. The chapter also describes scoring and statistical values that can be assigned during the protein identification process, to give confidence in the resulting values.","PeriodicalId":192946,"journal":{"name":"Proteome Informatics","volume":"39 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Chapter 5. Protein Inference and Grouping\",\"authors\":\"A. Jones\",\"doi\":\"10.1039/9781782626732-00093\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"A key process in many proteomics workflows is the identification of proteins, following analysis of tandem MS (MS/MS) spectra, for example by a database search. The core unit of identification from a database search is the identification of peptides, yet most researchers wish to know which proteins have been confidently identified in their samples. As such, following peptide identification, a second stage of data analysis is performed, either internally in the search engine or in a second package, called protein inference. Protein inference is challenging in the common case that proteins have been digested into peptides early in the proteomics workflow, and thus there is no direct link between a peptide and its parent protein. Many peptides could theoretically have been derived from more than one protein in the database searched, and thus it is not straightforward to determine which is the correct assignment. A variety of algorithms and implementations have been developed, which are reviewed in this chapter. Most approaches now report “protein groups” as a the core unit of identification from protein inference, since it is common for more than one database protein to share the same-set of evidence, and thus be indistinguishable. The chapter also describes scoring and statistical values that can be assigned during the protein identification process, to give confidence in the resulting values.\",\"PeriodicalId\":192946,\"journal\":{\"name\":\"Proteome Informatics\",\"volume\":\"39 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-11-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proteome Informatics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1039/9781782626732-00093\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proteome Informatics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1039/9781782626732-00093","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

许多蛋白质组学工作流程中的一个关键过程是在串联质谱(MS/MS)谱分析之后识别蛋白质,例如通过数据库搜索。从数据库搜索中识别的核心单元是肽的识别,然而大多数研究人员希望知道哪些蛋白质在他们的样品中被自信地识别出来。因此,在肽识别之后,在搜索引擎内部或在称为蛋白质推断的第二个包中执行第二阶段的数据分析。在蛋白质组学工作流程的早期,蛋白质已经被消化成肽,因此肽与其亲本蛋白之间没有直接联系,因此蛋白质推断是具有挑战性的。从理论上讲,许多多肽可能来源于数据库中搜索的不止一种蛋白质,因此确定哪一种是正确的分配并不简单。已经开发了各种算法和实现,本章将对其进行回顾。现在,大多数方法都将“蛋白质组”作为蛋白质推断识别的核心单位,因为一个以上的数据库蛋白质共享同一组证据是很常见的,因此无法区分。本章还描述了在蛋白质鉴定过程中可以分配的评分和统计值,以给予结果值的信心。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Chapter 5. Protein Inference and Grouping
A key process in many proteomics workflows is the identification of proteins, following analysis of tandem MS (MS/MS) spectra, for example by a database search. The core unit of identification from a database search is the identification of peptides, yet most researchers wish to know which proteins have been confidently identified in their samples. As such, following peptide identification, a second stage of data analysis is performed, either internally in the search engine or in a second package, called protein inference. Protein inference is challenging in the common case that proteins have been digested into peptides early in the proteomics workflow, and thus there is no direct link between a peptide and its parent protein. Many peptides could theoretically have been derived from more than one protein in the database searched, and thus it is not straightforward to determine which is the correct assignment. A variety of algorithms and implementations have been developed, which are reviewed in this chapter. Most approaches now report “protein groups” as a the core unit of identification from protein inference, since it is common for more than one database protein to share the same-set of evidence, and thus be indistinguishable. The chapter also describes scoring and statistical values that can be assigned during the protein identification process, to give confidence in the resulting values.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信