Prediction of DNA binding protein using FC feature selection in SVM with PsePSSM feature representation
Achmad Ridok
Proceedings of the 5th International Conference on Sustainable Information Engineering and Technology, 2020-11-16
DOI: 10.1145/3427423.3427462 (https://doi.org/10.1145/3427423.3427462)
DNA-binding proteins (DBPs) play an important role in various biological processes, including DNA replication, recombination, and repair. Because of this role, DBP identification remains an active area of development. DBP identification was initially carried out experimentally, but experimental methods are expensive and time-consuming. For this reason, machine learning-based prediction methods have been developed over the last decades. Although several such methods exist, there is still room to improve their performance. One way to improve DBP prediction performance is to select an appropriate algorithm for extracting feature vectors from amino acid sequences. In this paper, we use PsePSSM as the feature representation and an SVM with the RBF kernel, combined with FC feature selection, as the predictive model. The best configuration was determined by evaluating the parameters of PsePSSM, SVM, and FC. With the best-performing parameters, the model achieved an accuracy of 79.45% and an AUC of 79.6%.
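
The abstract does not spell out how the PsePSSM vector is built, so the following is only a minimal sketch of the commonly used pseudo position-specific scoring matrix formulation, not necessarily the exact variant used in the paper. It assumes a PSSM is already available as one row per residue with 20 columns (for example, from a profile search tool). The function names `normalize_pssm` and `pse_pssm`, the logistic normalization, and the parameter name `max_lag` are illustrative assumptions. The resulting vector has 20 column means followed by 20 values per lag, so its length is 20 × (1 + `max_lag`); the lag is presumably the kind of PsePSSM parameter evaluated in the paper.

```python
import math


def normalize_pssm(pssm):
    """Squash raw PSSM scores into (0, 1) with a logistic function.

    The choice of normalization is an assumption; the abstract does not state one.
    """
    return [[1.0 / (1.0 + math.exp(-score)) for score in row] for row in pssm]


def pse_pssm(pssm, max_lag):
    """Return a PsePSSM-style vector of length n_cols * (1 + max_lag).

    pssm    : list of L rows, one per residue, each with the same number of
              columns (20 for a standard amino-acid PSSM).
    max_lag : largest sequence-order lag; must be smaller than L.
    """
    length = len(pssm)
    if length == 0:
        raise ValueError("pssm must contain at least one row")
    n_cols = len(pssm[0])
    if any(len(row) != n_cols for row in pssm):
        raise ValueError("all pssm rows must have the same length")
    if not 0 <= max_lag < length:
        raise ValueError("max_lag must satisfy 0 <= max_lag < sequence length")

    # Part 1: the mean of each column over all residues.
    features = [sum(row[j] for row in pssm) / length for j in range(n_cols)]

    # Part 2: for each lag, the mean squared difference between residues
    # that are `lag` positions apart, computed separately for each column.
    for lag in range(1, max_lag + 1):
        for j in range(n_cols):
            total = sum(
                (pssm[i][j] - pssm[i + lag][j]) ** 2 for i in range(length - lag)
            )
            features.append(total / (length - lag))
    return features
```

For a toy two-column matrix `[[1.0, 0.0], [3.0, 0.0], [5.0, 4.0]]` and `max_lag=1`, the sketch returns four values: the two column means followed by the two lag-1 terms, 4.0 and 8.0. Because every protein maps to a vector of the same length regardless of sequence length, the output can be passed directly to a fixed-input classifier such as an SVM.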