A DNA-Binding Proteins Prediction Model Using Different Property Distance Transformation
Authors: Xiangyu Li, Lina Yang, Y. Tang, P. Wang
Published in: 2020 International Conference on Wavelet Analysis and Pattern Recognition (ICWAPR), 2020-12-02
DOI: 10.1109/ICWAPR51924.2020.9494609 (https://doi.org/10.1109/ICWAPR51924.2020.9494609)
Citation count: 0
Abstract
DNA-binding proteins are a class of proteins that can bind to DNA to form complexes. They are indispensable to cellular processes such as DNA recombination, modification, replication, virus integration, and transcription. With the rapid development of gene sequencing technology and the growing demand for it, more and more uncharacterized DNA-binding proteins are waiting to be predicted. However, developing a prediction method that is both accurate and fast remains challenging. In this article, the authors propose a new method named PSFM-DDT, which combines the Position Specific Frequency Matrix (PSFM) with the Different Property Distance Transformation (DDT). First, the evolutionary information of a protein sequence is expressed as a frequency matrix; then a distance transformation over different amino acids converts that matrix into a series of new feature vectors. The extracted feature vectors are used to train a Support Vector Machine (SVM) with a linear kernel, which yields the final model. The method reached an accuracy of 83.16% with the jackknife test on the benchmark dataset and 79.57% on the independent dataset. The experimental results indicate that the method performs significantly better than other prediction methods.
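The pipeline described above (frequency matrix, distance transformation, jackknife evaluation) can be illustrated with a minimal sketch. The abstract does not give the DDT formula, so the transformation below is an assumption: a generic distance-pair average over pairs of different amino-acid columns, which may differ from the paper's definition. The names `distance_pair_features`, `jackknife_accuracy`, `max_distance`, and `fit_predict` are hypothetical. The classifier is left pluggable; in practice `fit_predict` might wrap a linear-kernel SVM such as scikit-learn's `SVC(kernel="linear")`.

```python
from itertools import product
from typing import Callable, List, Sequence

Matrix = List[List[float]]  # one row per residue, one column per amino acid


def distance_pair_features(psfm: Matrix, max_distance: int) -> List[float]:
    """Flatten a variable-length frequency matrix into a fixed-size vector.

    For every distance d in 1..max_distance and every ordered pair of
    DIFFERENT columns (a, b), average psfm[i][a] * psfm[i + d][b] over all
    valid positions i. This is an assumed, generic distance transformation,
    not necessarily the exact DDT formula of the paper.
    """
    length = len(psfm)
    n_cols = len(psfm[0])
    features: List[float] = []
    for d in range(1, max_distance + 1):
        n_pairs = length - d  # number of residue pairs that are d apart
        for a, b in product(range(n_cols), repeat=2):
            if a == b:
                continue  # keep only pairs of different amino acids
            if n_pairs <= 0:
                features.append(0.0)  # sequence shorter than the distance
                continue
            total = sum(psfm[i][a] * psfm[i + d][b] for i in range(n_pairs))
            features.append(total / n_pairs)
    return features


def jackknife_accuracy(
    X: List[List[float]],
    y: List[int],
    fit_predict: Callable[[Sequence[List[float]], Sequence[int], List[float]], int],
) -> float:
    """Jackknife (leave-one-out) test: hold out each sample once,
    train on the remaining samples, and score the held-out prediction."""
    hits = 0
    for k in range(len(X)):
        train_X = X[:k] + X[k + 1:]
        train_y = y[:k] + y[k + 1:]
        hits += fit_predict(train_X, train_y, X[k]) == y[k]
    return hits / len(X)
```

With 20 amino-acid columns, this sketch yields 380 features per distance (20 x 19 ordered pairs), so every protein maps to a vector of the same length regardless of sequence length, which is what a linear SVM requires.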