{"title":"Interaction prediction of PDZ domains using a machine learning approach","authors":"Sibel Kalyoncu, O. Keskin, A. Gursoy","doi":"10.1109/HIBIT.2010.5478896","DOIUrl":null,"url":null,"abstract":"Protein interaction domains play crucial roles in many complex cellular pathways. PDZ domains are one of the most common protein interaction domains. Prediction of binding specificity of PDZ domains by a computational manner could eliminate unnecessary, time-consuming experiments. In this study, interactions of PDZ domains are predicted by using a machine learning approach in which only primary sequences of PDZ domains and peptides are used. In order to encode feature vectors for each interaction, trigram frequencies of primary sequences of PDZ domains and corresponding peptides are calculated. After construction of numerical interaction dataset, we compared different classifiers and ended up with Random Forest (RF) algorithm which gave the top performance. We obtained very high prediction accuracy (91.4%) for binary interaction prediction which outperforms all previous similar methods.","PeriodicalId":215457,"journal":{"name":"2010 5th International Symposium on Health Informatics and Bioinformatics","volume":"23 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 5th International Symposium on Health Informatics and Bioinformatics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HIBIT.2010.5478896","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citation count: 0
Abstract
Protein interaction domains play crucial roles in many complex cellular pathways, and PDZ domains are among the most common of them. Predicting the binding specificity of PDZ domains computationally could eliminate unnecessary, time-consuming experiments. In this study, interactions of PDZ domains are predicted with a machine learning approach that uses only the primary sequences of the PDZ domains and peptides. To encode a feature vector for each interaction, trigram frequencies of the primary sequences of the PDZ domain and its corresponding peptide are calculated. After constructing the numerical interaction dataset, we compared different classifiers and selected the Random Forest (RF) algorithm, which gave the best performance. We obtained a prediction accuracy of 91.4% for binary interaction prediction, which outperforms all previous similar methods.
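The trigram encoding described in the abstract can be illustrated with a minimal sketch. The abstract does not give the details, so everything specific here is an assumption: the 20-letter standard amino acid alphabet, overlapping (sliding-window) trigrams, normalization by the number of counted trigrams, and concatenation of the domain and peptide vectors into one feature vector. The function names `trigram_frequencies` and `encode_interaction` are hypothetical, and the sequences in the usage note are toy strings, not real PDZ data.

```python
from itertools import product

# Assumed alphabet: the 20 standard amino acids (one-letter codes).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# All 20^3 = 8000 possible trigrams, each mapped to a fixed vector position.
TRIGRAMS = ["".join(p) for p in product(AMINO_ACIDS, repeat=3)]
TRIGRAM_INDEX = {t: i for i, t in enumerate(TRIGRAMS)}


def trigram_frequencies(sequence):
    """Return a length-8000 list of overlapping trigram frequencies.

    Assumption: trigrams containing a non-standard residue are skipped,
    and counts are divided by the number of trigrams actually counted.
    """
    vec = [0.0] * len(TRIGRAMS)
    seq = sequence.upper()
    total = 0
    for i in range(len(seq) - 2):
        idx = TRIGRAM_INDEX.get(seq[i:i + 3])
        if idx is not None:
            vec[idx] += 1.0
            total += 1
    if total:
        vec = [v / total for v in vec]
    return vec


def encode_interaction(domain_seq, peptide_seq):
    """Concatenate domain and peptide trigram vectors (assumed layout)."""
    return trigram_frequencies(domain_seq) + trigram_frequencies(peptide_seq)
```

As a usage example, `encode_interaction("ACDACD", "ETSV")` yields a 16000-element vector for one domain-peptide pair. Stacking such vectors with binary labels (interacting or not) would give the kind of numerical dataset on which classifiers can be compared, for instance with scikit-learn's `RandomForestClassifier`, although the abstract does not say which implementation was used.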