Proteomics | Pub Date: 2024-05-27 | DOI: 10.1002/pmic.202400004
Dashleen Kaur, Akanksha Arora, Palani Vigneshwar, Gajendra P. S. Raghava
{"title":"Prediction of peptide hormones using an ensemble of machine learning and similarity-based methods","authors":"Dashleen Kaur, Akanksha Arora, Palani Vigneshwar, Gajendra P. S. Raghava","doi":"10.1002/pmic.202400004","DOIUrl":"10.1002/pmic.202400004","url":null,"abstract":"<p>Peptide hormones serve as genome-encoded signal transduction molecules that play essential roles in multicellular organisms, and their dysregulation can lead to various health problems. In this study, we propose a method for predicting hormonal peptides with high accuracy. The dataset used for training, testing, and evaluating our models consisted of 1174 hormonal and 1174 non-hormonal peptide sequences. Initially, we developed similarity-based methods utilizing BLAST and MERCI software. Although these similarity-based methods provided a high probability of correct prediction, they had limitations, such as no hits or prediction of limited sequences. To overcome these limitations, we further developed machine and deep learning-based models. Our logistic regression-based model achieved a maximum AUROC of 0.93 with an accuracy of 86% on an independent/validation dataset. To harness the power of similarity-based and machine learning-based models, we developed an ensemble method that achieved an AUROC of 0.96 with an accuracy of 89.79% and a Matthews correlation coefficient (MCC) of 0.8 on the validation set. To facilitate researchers in predicting and designing hormone peptides, we developed a web-based server called HOPPred. This server offers a unique feature that allows the identification of hormone-associated motifs within hormone peptides. 
The server can be accessed at: https://webs.iiitd.edu.in/raghava/hoppred/.</p>","PeriodicalId":224,"journal":{"name":"Proteomics","volume":"24 20","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141157042","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
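The abstract above reports accuracy and a Matthews correlation coefficient (MCC) for a binary hormone/non-hormone classifier. As a reminder of how these two metrics relate to a confusion matrix, here is a minimal sketch. The counts in the usage note are hypothetical and chosen only to make the arithmetic easy to follow; they are not the study's confusion matrix, and the function names are illustrative.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Returns 0.0 when any marginal total is zero (the usual convention),
    because the coefficient is undefined in that case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical, balanced example (NOT data from the paper).
example = dict(tp=90, tn=90, fp=10, fn=10)
example_accuracy = accuracy(**example)
example_mcc = mcc(**example)
```

With these made-up counts the accuracy is 0.9 and the MCC is 0.8. On a balanced dataset with symmetric errors, MCC equals 2 × accuracy − 1, which is why the two figures tend to move together on balanced sets like the one the abstract describes.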
Proteomics | Pub Date: 2024-05-25 | DOI: 10.1002/pmic.202300570
Haonan Duan, Zhibin Ning, Ailing Zhang, Daniel Figeys
{"title":"Spectral entropy as a measure of the metaproteome complexity","authors":"Haonan Duan, Zhibin Ning, Ailing Zhang, Daniel Figeys","doi":"10.1002/pmic.202300570","DOIUrl":"10.1002/pmic.202300570","url":null,"abstract":"<p>The diversity and complexity of the microbiome's genomic landscape are not always mirrored in its proteomic profile. Despite the anticipated proteomic diversity, observed complexities of microbiome samples are often lower than expected. Two main factors contribute to this discrepancy: limitations in mass spectrometry's detection sensitivity and bioinformatics challenges in metaproteomics identification. This study introduces a novel approach to evaluating sample complexity directly at the full mass spectrum (MS1) level rather than relying on peptide identifications. When analyzing under identical mass spectrometry conditions, microbiome samples displayed significantly higher complexity, as evidenced by the spectral entropy and peptide candidate entropy, compared to single-species samples. The research provides solid evidence for the complexity of microbiome in proteomics indicating the optimization potential of the bioinformatics workflow.</p>","PeriodicalId":224,"journal":{"name":"Proteomics","volume":"24 16","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-05-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/pmic.202300570","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141092433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
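The abstract above does not define spectral entropy. A common definition, assumed here, is the Shannon entropy of a spectrum's peak intensities after normalizing them to sum to one. The sketch below follows that assumption; the function name and the toy intensity lists are illustrative and may differ from the authors' implementation.

```python
import math


def spectral_entropy(intensities):
    """Shannon entropy (in nats) of normalized peak intensities.

    Assumption: zero or negative intensities are ignored, and an empty
    spectrum is assigned an entropy of 0.0.
    """
    positive = [x for x in intensities if x > 0]
    total = sum(positive)
    if total == 0:
        return 0.0
    # p * log(1/p) summed over peaks, with p = x / total.
    return sum((x / total) * math.log(total / x) for x in positive)


# Toy spectra (hypothetical): one dominant peak versus eight equal peaks.
single_peak = [100.0]
flat_spectrum = [1.0] * 8
low_entropy = spectral_entropy(single_peak)
high_entropy = spectral_entropy(flat_spectrum)
```

Under this definition a spectrum dominated by one peak has entropy 0, while a spectrum with n equally intense peaks reaches the maximum, ln(n). This matches the intuition that a more complex sample spreads signal over more MS1 features.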
Proteomics | Pub Date: 2024-05-24 | DOI: 10.1002/pmic.202300607
Bassim El-Sabawi, Shi Huang, Kahraman Tanriverdi, Andrew S. Perry, Kaushik Amancherla, Natalie Jackson, Jenna Hulsey, Jane E. Freedman, Ravi Shah, Brian R. Lindman
{"title":"Capillary blood self-collection for high-throughput proteomics","authors":"Bassim El-Sabawi, Shi Huang, Kahraman Tanriverdi, Andrew S. Perry, Kaushik Amancherla, Natalie Jackson, Jenna Hulsey, Jane E. Freedman, Ravi Shah, Brian R. Lindman","doi":"10.1002/pmic.202300607","DOIUrl":"10.1002/pmic.202300607","url":null,"abstract":"<p>In this study, we sought to compare protein concentrations obtained from a high-throughput proteomics platform (Olink) on samples collected using capillary blood self-collection (with the Tasso+ device) versus standard venipuncture (control). Blood collection was performed on 20 volunteers, including one sample obtained via venipuncture and two via capillary blood using the Tasso+ device. Tasso+ samples were stored at 2°C–8°C for 24 h (Tasso-24) or 48 h (Tasso-48) prior to processing to simulate shipping times from a study participant's home. Proteomics were analyzed using Olink (384 Inflammatory Panel). Tasso+ blood collection was successful in 37/40 attempts. Of 230 proteins included in our analysis, Pearson correlations (<i>r</i>) and mean coefficient of variation (CV) between Tasso-24 or Tasso-48 versus venipuncture were variable. In the Tasso-24 analysis, 34 proteins (14.8%) had both a correlation <i>r</i> > 0.5 and CV < 0.20. In the Tasso-48 analysis, 68 proteins (29.6%) had a correlation <i>r</i> > 0.5 and CV < 0.20. Combining the Tasso-24 and Tasso-48 analyses, 26 (11.3%) proteins met these thresholds. We concluded that protein concentrations from Tasso+ samples processed 24–48 h after collection demonstrated wide technical variability and variable correlation with a venipuncture gold-standard. Use of home capillary blood self-collection for large-scale proteomics should be limited to select proteins with good agreement with venipuncture.</p>","PeriodicalId":224,"journal":{"name":"Proteomics","volume":"24 16","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/pmic.202300607","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141086383","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
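The agreement criterion in the abstract above (Pearson r > 0.5 together with CV < 0.20) can be sketched for a single protein as follows. The abstract does not say how the mean CV was computed, so this sketch assumes a per-participant CV (standard deviation divided by mean of the capillary and venous values) averaged over participants. The function names and all measurements are hypothetical.

```python
import math
import statistics


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def mean_pairwise_cv(xs, ys):
    """Assumed CV definition: mean over participants of SD/mean per pair."""
    cvs = [
        statistics.stdev([x, y]) / statistics.fmean([x, y])
        for x, y in zip(xs, ys)
    ]
    return statistics.fmean(cvs)


def meets_thresholds(xs, ys, r_min=0.5, cv_max=0.20):
    """True when a protein satisfies both agreement criteria."""
    return pearson_r(xs, ys) > r_min and mean_pairwise_cv(xs, ys) < cv_max


# Hypothetical concentrations for five participants (not study data).
venous = [10.0, 12.0, 14.0, 16.0, 18.0]
capillary_good = [11.0, 12.5, 15.0, 16.5, 19.0]
capillary_poor = [18.0, 10.0, 16.0, 12.0, 14.0]
good_agreement = meets_thresholds(venous, capillary_good)
poor_agreement = meets_thresholds(venous, capillary_poor)
```

Requiring both criteria matters because they capture different failures: a protein can track venipuncture closely in rank (high r) while being systematically offset (high CV), or the reverse.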
Proteomics | Pub Date: 2024-05-20 | DOI: 10.1002/pmic.202300644
Samantha J. Emery-Corbin, Jumana M. Yousef, Subash Adhikari, Fransisca Sumardy, Duong Nhu, Mark F. van Delft, Guillaume Lessene, Jerzy Dziekan, Andrew I. Webb, Laura F. Dagley
{"title":"Improved drug target deconvolution with PISA-DIA using an extended, overlapping temperature gradient","authors":"Samantha J. Emery-Corbin, Jumana M. Yousef, Subash Adhikari, Fransisca Sumardy, Duong Nhu, Mark F. van Delft, Guillaume Lessene, Jerzy Dziekan, Andrew I. Webb, Laura F. Dagley","doi":"10.1002/pmic.202300644","DOIUrl":"10.1002/pmic.202300644","url":null,"abstract":"<p>Thermal proteome profiling (TPP) is a powerful tool for drug target deconvolution. Recently, data-independent acquisition mass spectrometry (DIA-MS) approaches have demonstrated significant improvements to depth and missingness in proteome data, but traditional TPP (a.k.a. CEllular Thermal Shift Assay “CETSA”) workflows typically employ multiplexing reagents reliant on data-dependent acquisition (DDA). Herein, we introduce a new experimental design for the Proteome Integral Solubility Alteration via label-free DIA approach (PISA-DIA). We highlight the proteome coverage and sensitivity achieved by using multiple overlapping thermal gradients alongside DIA-MS, which maximizes efficiencies in PISA sample concatenation and safeguards against missing protein targets that exist at high melting temperatures. We demonstrate our extended PISA-DIA design has superior proteome coverage as compared to using tandem-mass tags (TMT) necessitating DDA-MS analysis. Importantly, we demonstrate our PISA-DIA approach has the quantitative and statistical rigor using A-1331852, a specific inhibitor of BCL-xL. Due to the high melt temperature of this protein target, we utilized our extended multiple gradient PISA-DIA workflow to identify BCL-xL. 
We assert our novel overlapping gradient PISA-DIA-MS approach is ideal for unbiased drug target deconvolution, spanning a large temperature range whilst minimizing target dropout between gradients, increasing the likelihood of resolving the protein targets of novel compounds.</p>","PeriodicalId":224,"journal":{"name":"Proteomics","volume":"24 16","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/pmic.202300644","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141064275","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Proteomics | Pub Date: 2024-05-14 | DOI: 10.1002/pmic.202300280
Tomas Erban, Bruno Sopko
{"title":"Understanding bacterial pathogen diversity: A proteogenomic analysis and use of an array of genome assemblies to identify novel virulence factors of the honey bee bacterial pathogen Paenibacillus larvae","authors":"Tomas Erban, Bruno Sopko","doi":"10.1002/pmic.202300280","DOIUrl":"10.1002/pmic.202300280","url":null,"abstract":"<p>Mass spectrometry proteomics data are typically evaluated against publicly available annotated sequences, but the proteogenomics approach is a useful alternative. A single genome is commonly utilized in custom proteomic and proteogenomic data analysis. We pose the question of whether utilizing numerous different genome assemblies in a search database would be beneficial. We reanalyzed raw data from the exoprotein fraction of four reference Enterobacterial Repetitive Intergenic Consensus (ERIC) I–IV genotypes of the honey bee bacterial pathogen <i>Paenibacillus larvae</i> and evaluated them against three reference databases (from NCBI-protein, RefSeq, and UniProt) together with an array of protein sequences generated by six-frame direct translation of 15 genome assemblies from GenBank. The wide search yielded 453 protein hits/groups, which UpSet analysis categorized into 50 groups based on the success of protein identification by the 18 database components. Nine hits that were not identified by a unique peptide were not considered for marker selection, which discarded the only protein that was not identified by the reference databases. We propose that the variability in successful identifications between genome assemblies is useful for marker mining. The results suggest that various strains of <i>P. 
larvae</i> can exhibit specific traits that set them apart from the established genotypes ERIC I–V.</p>","PeriodicalId":224,"journal":{"name":"Proteomics","volume":"24 14","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-05-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/pmic.202300280","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140920261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
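The abstract above builds part of its search database by six-frame translation of genome assemblies. A minimal sketch of that step is shown below, assuming an unambiguous DNA string and the standard codon-to-amino-acid mapping (which the bacterial translation table shares for elongation). The function names are illustrative; the authors' actual tooling is not described in the abstract.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate complete codons; unknown codons (e.g. with N) become X."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translate(dna):
    """Return translations of the three forward and three reverse frames."""
    forward = dna.upper()
    reverse = forward.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(forward[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames


# Toy sequence (hypothetical): start codon, alanine codon, stop codon.
frames = six_frame_translate("ATGGCCTAA")
```

In a real database build, each frame would typically be split at stop codons (`*`) and short fragments discarded before the sequences are written out for the search engine.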
Proteomics | Pub Date: 2024-05-10 | DOI: 10.1002/pmic.202300210
Yong Chiang Tan, Teck Yew Low, Pey Yee Lee, Lay Cheng Lim
{"title":"Single-cell proteomics by mass spectrometry: Advances and implications in cancer research","authors":"Yong Chiang Tan, Teck Yew Low, Pey Yee Lee, Lay Cheng Lim","doi":"10.1002/pmic.202300210","DOIUrl":"10.1002/pmic.202300210","url":null,"abstract":"<p>Cancer harbours extensive proteomic heterogeneity. Inspired by the prior success of single-cell RNA sequencing (scRNA-seq) in characterizing minute transcriptomic heterogeneity in cancer, researchers are now actively searching for information regarding the proteomic counterpart. Consequently, single-cell proteomics by mass spectrometry (SCP) has recently developed rapidly into a state-of-the-art technology to meet this need. This review aims to summarize the application of SCP in cancer research, while reviewing the current development progress of SCP technology. The review also aims to contribute ideas on research gaps and future directions, ultimately promoting the application of SCP in cancer research.</p>","PeriodicalId":224,"journal":{"name":"Proteomics","volume":"24 12-13","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-05-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140896365","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Proteomics | Pub Date: 2024-05-03 | DOI: 10.1002/pmic.202300383
Shuangziying Zhang, Wenxiao Yang, Yuyue Xie, Xinrui Zhao, Haoyu Chen, Lishan Zhang, Xiangmin Lin
{"title":"Quantitative proteomics investigating the intrinsic adaptation mechanism of Aeromonas hydrophila to streptomycin","authors":"Shuangziying Zhang, Wenxiao Yang, Yuyue Xie, Xinrui Zhao, Haoyu Chen, Lishan Zhang, Xiangmin Lin","doi":"10.1002/pmic.202300383","DOIUrl":"10.1002/pmic.202300383","url":null,"abstract":"<p><i>Aeromonas hydrophila</i>, a prevalent pathogen in the aquaculture industry, poses significant challenges due to its drug-resistant strains. Moreover, residues of antibiotics like streptomycin, extensively employed in aquaculture settings, drive selective bacterial evolution, leading to the progressive development of resistance to this agent. However, the underlying mechanism of its intrinsic adaptation to antibiotics remains elusive. Here, we employed a quantitative proteomics approach to investigate the differences in protein expression between <i>A. hydrophila</i> under streptomycin (SM) stress and nonstress conditions. Notably, bioinformatics analysis unveiled the potential involvement of metal pathways, including metal cluster binding, iron-sulfur cluster binding, and transition metal ion binding, in influencing <i>A. hydrophila</i>’<i>s</i> resistance to SM. Furthermore, we evaluated the sensitivity of eight gene deletion strains related to streptomycin and observed the potential roles of petA and AHA_4705 in SM resistance. Collectively, our findings enhance the understanding of <i>A. 
hydrophila</i>’<i>s</i> response behavior to streptomycin stress and shed light on its intrinsic adaptation mechanism.</p>","PeriodicalId":224,"journal":{"name":"Proteomics","volume":"24 19","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140828467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Proteomics | Pub Date: 2024-05-03 | DOI: 10.1002/pmic.202300351
Inga Popova, Ekaterina Savelyeva, Tatyana Degtyarevskaya, Dmitrii Babaskin, Andrei Vokhmintsev
{"title":"Evaluation of proteome dynamics: Implications for statistical confidence in mass spectrometric determination","authors":"Inga Popova, Ekaterina Savelyeva, Tatyana Degtyarevskaya, Dmitrii Babaskin, Andrei Vokhmintsev","doi":"10.1002/pmic.202300351","DOIUrl":"10.1002/pmic.202300351","url":null,"abstract":"<p>Single-cell proteomics is currently far less productive than other approaches. Still, the proteomic community is having trouble adapting to the limitation of having to examine fewer cells than they would like. Studies on a small number of cells should be carefully planned to maximize the chances of success in this situation. This study aims to determine how sample size and measurement speed (slope)/variation affect the accuracy of a protein proteome mass spectrometric determination. The determination accuracy was shown to increase, and the false positive rate was shown to decrease as the sample size increased from 7 to 100 cells and the measurement slope/variation (S/V) ratio increased from 1 to 6. Furthermore, it was discovered that the number of cells in the sample increased the accuracy of this estimate. Thus, for 100 cells, the measurement S/V ratio was typically estimated to be very close to the real-world value, with a standard deviation of 0.35. For sample sizes from 7 to 100 cells, this accuracy was seen when calculating the measurement S/V ratio. 
The findings can help researchers plan experiments for mass spectroscopic protein proteome determination and other research purposes.</p>","PeriodicalId":224,"journal":{"name":"Proteomics","volume":"24 14","pages":""},"PeriodicalIF":3.4,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140828471","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}