Towards Centralized MS/MS Spectra Preprocessing: An Empirical Evaluation of Peptides Search Engines using Ground Truth Datasets

2017 IEEE 17th International Conference on Bioinformatics and Bioengineering (BIBE) Pub Date: 2017-10-01 DOI: 10.1109/BIBE.2017.00-56

Majdi Maabreh, Ajay K. Gupta, I. Alsmadi


Citations: 1

Abstract

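Several peptide search engines have been developed in recent decades. For the same inputs, different search engines often identify different peptides, which can confuse stakeholders in the field of proteomics. The massive volume of spectra generated by high-throughput spectrometers poses another challenge that handicaps current search engines. This has motivated researchers to evaluate combinations of several search engines. Several studies have provided ensemble solutions over shared and distributed computing environments to obtain reliable results. However, the massive volume of MS/MS spectra creates cumbersome traffic over the systems' networks. This issue directly affects search performance and adds unnecessary costs (computing, storage, network traffic) when a cloud cluster is used. The main question of this paper is: can we build a central MS/MS spectra preprocessing step for semantically different protein search engines? We evaluate different statistical reduction techniques using four popular protein search engines. To evaluate the results fairly, we build unanimity-based ground truth datasets for two species: yeast and human. Our techniques achieve significant peak reduction: only around 30% of the spectrum peaks are enough for the search engines used in this study to report reliable identifications.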
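The two ideas in the abstract, reducing each spectrum to a fraction of its peaks and keeping only identifications on which every engine agrees, can be illustrated with a minimal sketch. The abstract does not say which statistical reduction techniques were evaluated, so ranking peaks by intensity is an assumption here, as are the function names `reduce_peaks` and `unanimous_identifications`, the `(m/z, intensity)` tuple layout, and the reading of "unanimous" as exact agreement among all engines.

```python
from math import ceil


def reduce_peaks(peaks, keep_fraction=0.3):
    """Keep the most intense fraction of peaks, returned in m/z order.

    `peaks` is a list of (mz, intensity) tuples. Intensity ranking is an
    assumed reduction rule, not necessarily the one used in the paper.
    """
    if not 0 < keep_fraction <= 1:
        raise ValueError("keep_fraction must be in (0, 1]")
    if not peaks:
        return []
    # Round up so that at least one peak always survives.
    n_keep = max(1, ceil(len(peaks) * keep_fraction))
    strongest = sorted(peaks, key=lambda p: p[1], reverse=True)[:n_keep]
    return sorted(strongest, key=lambda p: p[0])


def unanimous_identifications(engine_results):
    """Return {spectrum_id: peptide} for spectra on which all engines agree.

    `engine_results` maps a (hypothetical) engine name to a dict of
    spectrum_id -> identified peptide sequence.
    """
    if not engine_results:
        return {}
    engines = list(engine_results.values())
    # Only spectra reported by every engine can be unanimous.
    shared_ids = set(engines[0]).intersection(*engines[1:])
    return {
        sid: engines[0][sid]
        for sid in shared_ids
        if all(e[sid] == engines[0][sid] for e in engines)
    }
```

With the default `keep_fraction=0.3`, a spectrum of 10 peaks is cut to its 3 most intense peaks, which mirrors the roughly 30% figure reported in the abstract. The consensus function would be applied once to the full spectra to build a reference set, and again to the reduced spectra to check whether the same identifications survive.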



