P2P proteomics -- data sharing for enhanced protein identification.

Marco Schorlemmer, Joaquín Abián, Carles Sierra, David de la Cruz, Lorenzo Bernacchioni, Enric Jaén, Adrian Perreau de Pinninck, Manuel Atencia
Journal: Automated Experimentation, volume 4, page 1
Published: 2012-01-31
DOI: 10.1186/1759-4499-4-1 (https://doi.org/10.1186/1759-4499-4-1)
Citations: 2

Abstract


P2P proteomics -- data sharing for enhanced protein identification.


Background: To tackle the important and challenging proteomics problem of identifying known and new protein sequences using high-throughput methods, we propose a data-sharing platform that uses fully distributed P2P technologies to share specifications of peer-interaction protocols and service components. With such a platform, the information to be searched is no longer centralised in a few repositories but is gathered from experiments in peer proteomics laboratories and can subsequently be searched by fellow researchers.

Methods: The system runs, in a distributed fashion, a data-sharing protocol specified in the Lightweight Communication Calculus that underlies it; researchers interact through this protocol via message passing. To do so, they use components that link to database querying systems based on BLAST and/or OMSSA and to GUI-based visualisation environments. We tested the proposed platform with data drawn from preexisting MS/MS data reservoirs from the 2006 ABRF (Association of Biomolecular Resource Facilities) test sample, which was extensively tested during the ABRF Proteomics Standards Research Group 2006 worldwide survey. In particular, we took the data available from a subset of the proteomics laboratories of Spain's National Institute for Proteomics, ProteoRed, a network for the coordination, integration and development of the Spanish proteomics facilities.

Results and discussion: We queried nine databases: those of seven ProteoRed proteomics laboratories, the NCBI Swiss-Prot database, and the local database of the CSIC/UAB Proteomics Laboratory. A detailed analysis of the results indicated the presence of a protein that was supported by other NCBI matches and by high-scoring matches in several proteomics labs. The analysis clearly indicated that the protein was a contaminant, at relatively high concentration, that could be present in the ABRF sample. This is evident from the information derived from the proposed P2P proteomics system; however, it is not straightforward to arrive at the same conclusion by conventional means, because it is difficult to rule out organic contamination of samples. The actual presence of this contaminant was only stated after the ABRF study of all the identifications reported by the laboratories.
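
The cross-laboratory reasoning described above can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use the Lightweight Communication Calculus, BLAST, or OMSSA; it only assumes that each peer returns a list of (protein, score) hits for the same query. The function name `aggregate_hits`, the peer and protein names, and the score thresholds are all hypothetical.

```python
from collections import defaultdict


def aggregate_hits(peer_results, min_score=50.0, min_peers=2):
    """Return proteins reported by at least `min_peers` peers.

    `peer_results` maps a peer name to a list of (protein, score) hits.
    Hits scoring below `min_score` are ignored. This is an illustrative
    assumption, not the scoring scheme of the original system.
    """
    support = defaultdict(dict)
    for peer, hits in peer_results.items():
        for protein, score in hits:
            if score < min_score:
                continue
            # Keep only the best score each peer reports for a protein.
            best = support[protein].get(peer)
            if best is None or score > best:
                support[protein][peer] = score
    consensus = {p: s for p, s in support.items() if len(s) >= min_peers}
    # Most widely supported proteins first, ties broken by name.
    return sorted(consensus.items(), key=lambda kv: (-len(kv[1]), kv[0]))


if __name__ == "__main__":
    # Hypothetical hits returned by three peers for one query.
    peer_results = {
        "lab_a": [("PROT_X", 80.0), ("PROT_Y", 30.0)],
        "lab_b": [("PROT_X", 65.0), ("PROT_Z", 90.0)],
        "lab_c": [("PROT_X", 55.0), ("PROT_Z", 40.0)],
    }
    for protein, by_peer in aggregate_hits(peer_results):
        print(protein, sorted(by_peer))
```

A protein that scores well in several independent peer databases, as `PROT_X` does in this made-up data, is the kind of candidate that a single centralised search would not single out as easily.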
