Protein secondary structure prediction with ICA feature extraction
J. Melo, George D. C. Cavalcanti, K. Guimaraes
2003 IEEE XIII Workshop on Neural Networks for Signal Processing (IEEE Cat. No.03TH8718), 2003-09-17
DOI: 10.1109/NNSP.2003.1318000 (https://doi.org/10.1109/NNSP.2003.1318000)
This work presents an original application of independent component analysis (ICA). This linear transformation method is used for feature extraction in a machine learning approach to the protein secondary structure prediction problem. PSI-BLAST profiles, built on NCBI's nonredundant protein database, have their dimensionality reduced with ICA. The resulting components are used as input to three artificial neural networks with 30, 35, or 40 nodes in the hidden layer. These classifiers are trained with the RPROP algorithm, and five rules are used to combine their outputs. The results are compared with the best ones recently obtained under similar conditions, including experiments that use principal component analysis (PCA) as the feature extraction method, and the ICA-based approach gives the best result of these. Individually, the networks achieve a Q3 accuracy of 74.1% on average, using only 120 independent components. When the networks are combined with the product rule, the accuracy rises to 75.2%. This result is surpassed only when the raw data are fed directly to the networks, in which case an accuracy of 75.9% is achieved.
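The abstract names the product rule as one of the five combination rules and reports results as Q3 accuracy, but gives no formulas. The following is a minimal Python sketch of both ideas, not the authors' implementation. It assumes the usual three-state labels (H for helix, E for strand, C for coil), that each network outputs one score per class for each residue, and that the combined scores are renormalised. The function names and the toy numbers are hypothetical.

```python
from math import prod

# Assumed three-state labels: helix, strand, coil.
CLASSES = ("H", "E", "C")


def product_rule(per_network_scores):
    """Combine per-class scores from several networks by multiplying them."""
    n_classes = len(per_network_scores[0])
    scores = [prod(net[k] for net in per_network_scores) for k in range(n_classes)]
    total = sum(scores)
    # Renormalise so the combined scores sum to 1 (assumption; skipped if all zero).
    return [s / total for s in scores] if total > 0 else scores


def predict(per_network_scores):
    """Return the label with the highest combined score for one residue."""
    combined = product_rule(per_network_scores)
    best = max(range(len(combined)), key=combined.__getitem__)
    return CLASSES[best]


def q3(predicted, observed):
    """Fraction of residues whose three-state label is predicted correctly."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    return sum(p == o for p, o in zip(predicted, observed)) / len(observed)


# Toy data: for each of 4 residues, the outputs of three hypothetical networks.
toy_outputs = [
    [[0.6, 0.3, 0.1], [0.5, 0.4, 0.1], [0.7, 0.2, 0.1]],
    [[0.2, 0.5, 0.3], [0.4, 0.35, 0.25], [0.1, 0.6, 0.3]],
    [[0.1, 0.2, 0.7], [0.2, 0.2, 0.6], [0.3, 0.3, 0.4]],
    [[0.4, 0.35, 0.25], [0.3, 0.45, 0.25], [0.45, 0.3, 0.25]],
]
toy_predicted = "".join(predict(residue) for residue in toy_outputs)
toy_q3 = q3(toy_predicted, "HECE")
print(toy_predicted, toy_q3)  # prints: HECH 0.75
```

In this toy case three of the four residues are labelled correctly, so Q3 is 0.75. Multiplying scores lets a single network with a very low score for a class effectively veto it, which is one reason a product rule can behave differently from averaging.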