Lan Wang, Da Xie, Zhongyu Zhou, Shengli Zhang, Taotao Wang
WCNC Workshops. DOI: 10.1109/WCNCW.2018.8369022 (https://doi.org/10.1109/WCNCW.2018.8369022)
Data preprocessing in tandem mass spectra based on SVM
Analysis based on tandem mass spectrometry plays a leading role in protein identification and can generate a large number of mass spectra in a short time. However, every spectrum contains some noise, which lengthens the database search and interferes with the mass spectrometry results. In this paper, we propose a new SVM-based preprocessing method to address this problem. To distinguish noise peaks from signal peaks before removing them, our method first selects 25 features based on test results on real data. The peaks, characterized by these 25 features, are then fed to an SVM (support vector machine) model for final identification. Experimental results show that the proposed method effectively increases the efficiency of spectral analysis and the number of peptide and protein identifications.
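The pipeline described above (compute features per peak, classify each peak with an SVM, drop the peaks labelled as noise) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not list the 25 features or the kernel, so the two features here (intensity relative to the base peak and intensity rank fraction), the linear kernel, the subgradient-descent training loop, the toy spectra, and all function names are assumptions made for the example.

```python
def peak_features(spectrum, index):
    """Two illustrative features for one peak (placeholders, not the paper's 25)."""
    intensity = spectrum[index][1]
    intensities = [i for _, i in spectrum]
    # Feature 1: intensity relative to the most intense (base) peak.
    relative = intensity / max(intensities)
    # Feature 2: fraction of peaks in the spectrum that are no more intense.
    rank = sum(1 for i in intensities if i <= intensity) / len(intensities)
    return [relative, rank]


def train_linear_svm(X, y, lam=1e-3, lr=0.1, epochs=500):
    """Linear soft-margin SVM fitted by per-sample subgradient descent on
    lam/2 * ||w||^2 + hinge loss. Labels must be +1 (signal) or -1 (noise)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            margin = label * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            # L2 shrinkage from the regulariser (the bias is not regularised).
            w = [wj * (1.0 - lr * lam) for wj in w]
            if margin < 1.0:  # hinge loss is active for this sample
                w = [wj + lr * label * xj for wj, xj in zip(w, x)]
                b += lr * label
    return w, b


def is_signal(w, b, x):
    """Classify one feature vector: True means signal, False means noise."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b > 0.0


def denoise(spectrum, w, b):
    """Keep only the peaks that the classifier labels as signal."""
    return [peak for k, peak in enumerate(spectrum)
            if is_signal(w, b, peak_features(spectrum, k))]


# Toy labelled spectrum (assumed data): (m/z, intensity), +1 = signal, -1 = noise.
train_spectrum = [(100.0, 5.0), (150.2, 80.0), (151.0, 4.0), (300.5, 100.0),
                  (301.1, 8.0), (450.3, 60.0), (452.0, 3.0)]
train_labels = [-1, 1, -1, 1, -1, 1, -1]

X = [peak_features(train_spectrum, k) for k in range(len(train_spectrum))]
w, b = train_linear_svm(X, train_labels)

# Apply the trained model to another (also made-up) spectrum.
new_spectrum = [(120.1, 2.0), (210.4, 90.0), (211.0, 6.0),
                (333.3, 100.0), (410.7, 70.0), (500.2, 4.0)]
print(denoise(new_spectrum, w, b))
```

In practice the classifier would be trained on peaks from many annotated spectra and on the full feature set, and a library SVM with a tuned kernel would likely replace the hand-written training loop; the sketch only shows how the three stages connect.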