Pushpendra Singh Gahlot, Shubham Choudhury, Nisha Bajiya, Nishant Kumar, Gajendra P S Raghava
Prediction of Plant Resistance Proteins Using Alignment-Based and Alignment-Free Approaches
Proteomics, e202400261. Published 2024-11-24. DOI: 10.1002/pmic.202400261 (https://doi.org/10.1002/pmic.202400261)
Abstract
Plant disease resistance (PDR) proteins are critical in identifying plant pathogens. Predicting PDR proteins is essential for understanding plant-pathogen interactions and developing strategies for crop protection. This study proposes a hybrid model for predicting and designing PDR proteins against plant-invading pathogens. Initially, we tried alignment-based approaches: the Basic Local Alignment Search Tool (BLAST) for similarity search and MERCI for motif search. These alignment-based approaches showed very poor coverage, that is, low sensitivity. To overcome this limitation, we developed alignment-free, machine learning (ML)-based methods. Our ML-based model, developed using compositional features of proteins, achieved a maximum area under the receiver operating characteristic curve (AUROC) of 0.91. Performance improved significantly, from an AUROC of 0.91 to 0.95, when we used evolutionary information instead of the protein sequence. Finally, we developed a hybrid (ensemble) model that combined our best ML model with BLAST and obtained the highest AUROC, 0.98, on the validation dataset. We trained and tested our models on a training dataset and evaluated them on a validation dataset. No protein in the validation dataset is more than 40% similar to any protein in the training dataset. One objective of this study is to support the scientific community working in plant biology. We therefore developed an online platform for predicting and designing plant resistance proteins, "PlantDRPpred" (https://webs.iiitd.edu.in/raghava/plantdrppred).
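To make the two ideas in the abstract concrete, the following is a minimal Python sketch of (a) one common compositional feature, amino acid composition, and (b) one possible way to combine an ML probability with a BLAST hit. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which compositional features were used or how the ML and BLAST outputs were combined. The function names, the `blast_label` encoding, and the `blast_weight` value of 0.5 are all hypothetical.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so feature vectors line up.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the percentage of each standard residue as a 20-element list.

    Non-standard characters (for example 'X') are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [100.0 * counts[aa] / total for aa in AMINO_ACIDS]


def hybrid_score(ml_probability, blast_label=None, blast_weight=0.5):
    """Combine an ML probability with an optional BLAST result.

    Assumed (hypothetical) encoding of the BLAST result:
      +1   top hit is a known resistance protein
      -1   top hit is a known non-resistance protein
      None no hit passed the chosen E-value cutoff
    The additive rule and the default weight are illustrative assumptions.
    """
    if not 0.0 <= ml_probability <= 1.0:
        raise ValueError("ml_probability must lie in [0, 1]")
    if blast_label is None:
        # No similarity evidence: fall back to the ML model alone.
        return ml_probability
    if blast_label not in (1, -1):
        raise ValueError("blast_label must be +1, -1, or None")
    return ml_probability + blast_weight * blast_label
```

In this sketch the ML model covers every query, while BLAST only shifts the score for queries that have a confident hit, which is one way a hybrid can compensate for the low coverage of alignment-based search noted above. A query would then be called a resistance protein when the combined score exceeds a threshold chosen on the training data.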
About the journal:
PROTEOMICS is the premier international source for information on all aspects of applications and technologies, including software, in proteomics and other "omics". The journal includes but is not limited to proteomics, genomics, transcriptomics, metabolomics and lipidomics, and systems biology approaches. Papers describing novel applications of proteomics and integration of multi-omics data and approaches are especially welcome.