Authors: Majdi Maabreh, Ajay K. Gupta, I. Alsmadi
Published in: 2017 IEEE 17th International Conference on Bioinformatics and Bioengineering (BIBE), 2017-10-01
DOI: 10.1109/BIBE.2017.00-56 (https://doi.org/10.1109/BIBE.2017.00-56)
Towards Centralized MS/MS Spectra Preprocessing: An Empirical Evaluation of Peptides Search Engines using Ground Truth Datasets
Several peptide search engines have been developed in recent decades. For the same inputs, different search engines often identify different peptides, which can confuse stakeholders in the field of proteomics. The massive volume of spectra generated by high-throughput spectrometers adds another challenge that handicaps current search engines. This has motivated researchers to evaluate combinations of several search engines, and several studies have provided ensemble solutions over shared and distributed computing environments to obtain reliable results. However, the massive volume of MS/MS spectra creates cumbersome traffic over the systems' networks. This directly impacts search performance and adds unnecessary costs (computing, storage, network traffic) when a cloud cluster is used. The main question of this paper is: can we build a central MS/MS spectra preprocessing step for semantically different protein search engines? We evaluate different statistical reduction techniques using four popular protein search engines. To evaluate the results fairly, we build unanimity-based ground-truth datasets for two species: yeast and human. Our techniques yield a significant peak reduction: only around 30% of the spectrum peaks are enough to report reliable identifications from the search engines used in this study.
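As a rough illustration of the two ideas in the abstract, the sketch below shows one possible peak-reduction step and one possible way to build a unanimity-based ground truth. The abstract does not say which statistical reduction techniques the authors evaluated, so the intensity-rank filter here is an assumption chosen for simplicity, not the paper's method. The names `keep_top_peaks` and `unanimous_ids`, the `(m/z, intensity)` peak representation, and the reading of "unanimous" as "every engine reports the same peptide for a spectrum" are likewise hypothetical.

```python
from typing import Dict, List, Tuple

Peak = Tuple[float, float]  # (m/z, intensity); an assumed representation


def keep_top_peaks(peaks: List[Peak], fraction: float = 0.3) -> List[Peak]:
    """Keep the most intense `fraction` of peaks, returned in m/z order.

    Assumption: a plain intensity-rank filter stands in for the
    unspecified statistical reduction techniques of the paper.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    if not peaks:
        return []
    # Always keep at least one peak so a spectrum never becomes empty.
    n_keep = max(1, round(len(peaks) * fraction))
    strongest = sorted(peaks, key=lambda p: p[1], reverse=True)[:n_keep]
    return sorted(strongest, key=lambda p: p[0])


def unanimous_ids(results: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Return spectrum -> peptide pairs on which every engine agrees.

    `results` maps a (hypothetical) engine name to that engine's
    spectrum-id -> peptide assignments.
    """
    engines = list(results.values())
    if not engines:
        return {}
    agreed: Dict[str, str] = {}
    for spectrum_id, peptide in engines[0].items():
        if all(e.get(spectrum_id) == peptide for e in engines[1:]):
            agreed[spectrum_id] = peptide
    return agreed
```

In a centralized setup, a filter like `keep_top_peaks` would run once before the spectra are distributed to the individual engines, so each engine receives the smaller spectrum. Identifications from the reduced spectra could then be compared against the consensus set returned by `unanimous_ids`.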