Armand Paauw, Evgeni Levin, Ingrid A I Voskamp-Visser, Ilka M F Marissen, Vincent Ramisse, Marine Eschlimann, Jiří Dresler, Petr Pajer, Christoph Stingl, Hans C van Leeuwen, Theo M Luider, Luc M Hornstra
P4PP: A Universal Shotgun Proteomics Data Analysis Pipeline for Virus Identification.
Molecular & Cellular Proteomics, article 101004. Published 2025-07-01 (Epub 2025-05-29). DOI: 10.1016/j.mcpro.2025.101004
Abstract
Humans can be infected by a wide variety of virus species. We developed a data analysis approach for shotgun proteomic data to detect these viruses. A proteome for pandemic preparedness (P4PP) pipeline, a corresponding database (P4PP v01), and a web application (P4PP) were constructed. Based on multiple identified discriminatory peptides, the P4PP pipeline enables the identification of 1896 virus species from the 32 virus families in which at least one human-infectious virus is described. P4PP was evaluated using datasets of cell-cultivated viruses that were generated at different institutes, measured with different instruments, and prepared with different sample preparation methods. In total, 174 mass spectrometry datasets were analyzed: 160 tryptic protein digests of virus-infected cell lines and 14 of noninfected cell lines. Of the 160 infected samples, 146 were correctly identified at the species level, and an additional four were identified at the family level. In the remaining 10 samples, no virus was detected. However, all 10 of these negative samples came from time series in which follow-up samples obtained later tested positive, indicating that the number of virus-derived peptides was initially too low in the samples obtained at the start of the experiment. Furthermore, the results show that influenza A or severe acute respiratory syndrome coronavirus 2 can be subtyped if enough discriminative peptides of the virus are identified. In the noninfected cell lines, no virus was detected except in one sample, in which the virus studied in that experiment was detected. Shotgun proteomics, in combination with the developed data analysis approach, can identify all types of virus species after cultivation in a cell line. Implementing this agnostic virus proteome analysis capability in viral diagnostic laboratories has the potential to improve their ability to cope with unexpected, mutated, or re-emerging viruses.
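The abstract does not describe how P4PP selects or scores discriminatory peptides, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that a peptide is "discriminatory" when it occurs in exactly one species in the reference database, and that a species is reported once at least `min_peptides` such peptides are observed. The database contents, the function names, and the threshold of 2 are all hypothetical and chosen for illustration.

```python
from collections import defaultdict

# Hypothetical toy database: species -> set of peptide sequences.
# Species names and sequences are invented for illustration only.
TOY_DB = {
    "virus_A": {"AAAK", "CCCR", "DDDK", "GGGK"},
    "virus_B": {"EEEK", "FFFR", "GGGK"},
}


def discriminatory_peptides(db):
    """Map each peptide that occurs in exactly one species to that species."""
    owners = defaultdict(set)
    for species, peptides in db.items():
        for pep in peptides:
            owners[pep].add(species)
    # Peptides shared between species (here "GGGK") are dropped.
    return {pep: next(iter(sp)) for pep, sp in owners.items() if len(sp) == 1}


def identify(observed, db, min_peptides=2):
    """Return species supported by at least `min_peptides` discriminatory peptides.

    `min_peptides=2` is an assumed threshold standing in for "multiple".
    """
    unique = discriminatory_peptides(db)
    counts = defaultdict(int)
    for pep in set(observed):  # count each distinct peptide once
        if pep in unique:
            counts[unique[pep]] += 1
    return {sp: n for sp, n in counts.items() if n >= min_peptides}
```

Under these assumptions, `identify(["AAAK", "CCCR", "GGGK"], TOY_DB)` reports `virus_A` with two supporting peptides, while a sample with only one discriminatory peptide yields no call. This mirrors the abstract's observation that early time-series samples can remain negative when too few virus-derived peptides are present.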
About the journal:
The mission of MCP is to foster the development and applications of proteomics in both basic and translational research. MCP will publish manuscripts that report significant new biological or clinical discoveries underpinned by proteomic observations across all kingdoms of life. Manuscripts must define the biological roles played by the proteins investigated or their mechanisms of action.
The journal also emphasizes articles that describe innovative new computational methods and technological advancements that will enable future discoveries. Manuscripts describing such approaches do not have to include a solution to a biological problem, but must demonstrate that the technology works as described, is reproducible and is appropriate to uncover yet unknown protein/proteome function or properties using relevant model systems or publicly available data.
Scope:
-Fundamental studies in biology, including integrative "omics" studies, that provide mechanistic insights
-Novel experimental and computational technologies
-Proteogenomic data integration and analysis that enable greater understanding of physiology and disease processes
-Pathway and network analyses of signaling that focus on the roles of post-translational modifications
-Studies of proteome dynamics and quality controls, and their roles in disease
-Studies of evolutionary processes affecting proteome dynamics, quality and regulation
-Chemical proteomics, including mechanisms of drug action
-Proteomics of the immune system and antigen presentation/recognition
-Microbiome proteomics, host-microbe and host-pathogen interactions, and their roles in health and disease
-Clinical and translational studies of human diseases
-Metabolomics to understand functional connections between genes, proteins and phenotypes