S. Mahmoudian, Abdulaziz Yousef, Nasrollah Moghadam Charkari
{"title":"基于PCA和SVR-PHCS的蛋白质相互作用预测","authors":"S. Mahmoudian, Abdulaziz Yousef, Nasrollah Moghadam Charkari","doi":"10.2174/1875036201509010001","DOIUrl":null,"url":null,"abstract":"Protein-Protein Interactions (PPIs) play a key role in many biological systems. Thus, identifying PPIs is critical for understanding cellular processes. Many experimental techniques were applied to predict PPIs. The data extracted using these techniques are incomplete and noisy. In this regard, a number of computational methods include machine learning classification techniques have been developed to reduce the noise data and predict new PPIs. Since, using regression methods to solve classification problems has good results in other applications. Therefore, in this paper, a regression view is applied to the PPI prediction classification problem, so a new approach is proposed using Principal Component Analysis (PCA) and Support Vector Regression (SVR) which has been improved by a new Parallel Hierarchical Cube Search (PHCS) method. Firstly, PCA algorithm is implemented to select an optimal subset of features which leads to reduce processing time and to lessen the effect of noise. Then, the PPIs would be predicted, by using SVR. To get a better performance of SVR, a new PHCS method has been applied to select the appropriate values of SVR parameters. The obtained classification accuracy of the proposed method is 74.505% on KUPS (The University of Kansas Proteomics Service) dataset which outperforms the other methods.","PeriodicalId":38956,"journal":{"name":"Open Bioinformatics Journal","volume":"41 1","pages":"1-12"},"PeriodicalIF":0.0000,"publicationDate":"2015-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Protein-Protein Interaction Prediction using PCA and SVR-PHCS\",\"authors\":\"S. Mahmoudian, Abdulaziz Yousef, Nasrollah Moghadam Charkari\",\"doi\":\"10.2174/1875036201509010001\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Protein-Protein Interactions (PPIs) play a key role in many biological systems. Thus, identifying PPIs is critical for understanding cellular processes. Many experimental techniques were applied to predict PPIs. The data extracted using these techniques are incomplete and noisy. In this regard, a number of computational methods include machine learning classification techniques have been developed to reduce the noise data and predict new PPIs. Since, using regression methods to solve classification problems has good results in other applications. Therefore, in this paper, a regression view is applied to the PPI prediction classification problem, so a new approach is proposed using Principal Component Analysis (PCA) and Support Vector Regression (SVR) which has been improved by a new Parallel Hierarchical Cube Search (PHCS) method. Firstly, PCA algorithm is implemented to select an optimal subset of features which leads to reduce processing time and to lessen the effect of noise. Then, the PPIs would be predicted, by using SVR. To get a better performance of SVR, a new PHCS method has been applied to select the appropriate values of SVR parameters. The obtained classification accuracy of the proposed method is 74.505% on KUPS (The University of Kansas Proteomics Service) dataset which outperforms the other methods.\",\"PeriodicalId\":38956,\"journal\":{\"name\":\"Open Bioinformatics Journal\",\"volume\":\"41 1\",\"pages\":\"1-12\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-01-23\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Open Bioinformatics Journal\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.2174/1875036201509010001\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"Computer Science\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Open Bioinformatics Journal","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2174/1875036201509010001","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Computer Science","Score":null,"Total":0}
引用次数: 0
摘要
蛋白质-蛋白质相互作用(PPIs)在许多生物系统中起着关键作用。因此,识别ppi对于理解细胞过程至关重要。许多实验技术被应用于预测ppi。使用这些技术提取的数据是不完整和有噪声的。在这方面,已经开发了许多计算方法,包括机器学习分类技术,以减少噪声数据并预测新的ppi。因此,使用回归方法来解决分类问题在其他应用中也有很好的效果。为此,本文将回归的观点应用于PPI预测分类问题,提出了一种基于主成分分析(PCA)和支持向量回归(SVR)的PPI预测分类方法,并在此基础上改进了一种新的并行分层立方搜索(PHCS)方法。首先,采用主成分分析算法选择最优的特征子集,减少处理时间和噪声的影响;然后,利用SVR对ppi进行预测。为了获得更好的SVR性能,采用一种新的PHCS方法来选择合适的SVR参数值。在美国堪萨斯大学蛋白质组学服务(University of Kansas Proteomics Service)数据集上,该方法的分类准确率为74.505%,优于其他方法。
Protein-Protein Interaction Prediction using PCA and SVR-PHCS
Protein-Protein Interactions (PPIs) play a key role in many biological systems. Thus, identifying PPIs is critical for understanding cellular processes. Many experimental techniques were applied to predict PPIs. The data extracted using these techniques are incomplete and noisy. In this regard, a number of computational methods include machine learning classification techniques have been developed to reduce the noise data and predict new PPIs. Since, using regression methods to solve classification problems has good results in other applications. Therefore, in this paper, a regression view is applied to the PPI prediction classification problem, so a new approach is proposed using Principal Component Analysis (PCA) and Support Vector Regression (SVR) which has been improved by a new Parallel Hierarchical Cube Search (PHCS) method. Firstly, PCA algorithm is implemented to select an optimal subset of features which leads to reduce processing time and to lessen the effect of noise. Then, the PPIs would be predicted, by using SVR. To get a better performance of SVR, a new PHCS method has been applied to select the appropriate values of SVR parameters. The obtained classification accuracy of the proposed method is 74.505% on KUPS (The University of Kansas Proteomics Service) dataset which outperforms the other methods.
期刊介绍:
The Open Bioinformatics Journal is an Open Access online journal, which publishes research articles, reviews/mini-reviews, letters, clinical trial studies and guest edited single topic issues in all areas of bioinformatics and computational biology. The coverage includes biomedicine, focusing on large data acquisition, analysis and curation, computational and statistical methods for the modeling and analysis of biological data, and descriptions of new algorithms and databases. The Open Bioinformatics Journal, a peer reviewed journal, is an important and reliable source of current information on the developments in the field. The emphasis will be on publishing quality articles rapidly and freely available worldwide.