Prediction of Protein Coding Regions by Support Vector Machine
Guo Shuo, Yi-sheng Zhu
2009 International Symposium on Intelligent Ubiquitous Computing and Education. Published 2009-05-15.
DOI: 10.1109/IUCE.2009.141 (https://doi.org/10.1109/IUCE.2009.141)
Citations: 11
Abstract
With the exponential growth of genomic sequence data, there is an increasing demand for accurate identification of protein coding regions in genomic sequences. Although computational methods for identifying protein coding regions have progressed considerably in recent years, the accuracy and efficiency of these prediction methods still need to be improved. A novel method for predicting the position of coding regions is proposed. First, a support vector machine is used as a classifier to recognize the first nucleotide of a codon in a coding region. Then the classifier's output values are analyzed with the short-time Fourier transform (STFT), and the position of the coding regions is determined from differences in their time-frequency characteristics. The algorithm not only predicts coding regions but also identifies the first nucleotide of each codon within them, which is important for accurate translation into a protein sequence. Simulation results show that the proposed method predicts coding regions more effectively than existing coding region discovery tools.
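The second stage described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the classifier stage is replaced by synthetic scores (+1 at every first codon position inside a hypothetical coding region, uniform noise elsewhere), and the window length of 60, the step of 3, and the rectangular window are all assumptions chosen for illustration. The sketch evaluates a sliding-window Fourier transform only at the period-3 frequency, which is where a classifier that fires on every third nucleotide would be expected to concentrate its energy.

```python
import cmath
import math
import random


def period3_strength(scores, window=60, step=3):
    """Sliding-window magnitude of the period-3 Fourier component.

    Returns (start, strength) pairs, where strength is
    |sum_n s[start + n] * exp(-2j*pi*n/3)| / window (rectangular window).
    """
    # Only three distinct twiddle factors occur at frequency 1/3.
    twiddle = [cmath.exp(-2j * math.pi * k / 3) for k in range(3)]
    profile = []
    for start in range(0, len(scores) - window + 1, step):
        acc = sum(scores[start + n] * twiddle[n % 3] for n in range(window))
        profile.append((start, abs(acc) / window))
    return profile


def synthetic_scores(n_flank=300, n_coding=300, seed=0):
    """Stand-in for classifier output (hypothetical, not real SVM scores).

    Flanking non-coding regions are uniform noise in [-1, 1]; the coding
    region scores +1 on each first codon position and -1 otherwise.
    """
    rng = random.Random(seed)
    left = [rng.uniform(-1.0, 1.0) for _ in range(n_flank)]
    coding = [1.0 if i % 3 == 0 else -1.0 for i in range(n_coding)]
    right = [rng.uniform(-1.0, 1.0) for _ in range(n_flank)]
    return left + coding + right


if __name__ == "__main__":
    scores = synthetic_scores()
    profile = period3_strength(scores)
    # Assumed decision threshold; a real system would tune this on data.
    called = [start for start, strength in profile if strength > 0.5]
    if called:
        print("windows above threshold span", called[0], "to", called[-1] + 60)
```

In this toy setting, windows that lie entirely inside the coding region have a strength of 2/3, while noise-only windows stay far lower, so a simple threshold separates them. With real classifier output the contrast would be weaker, and the window length trades positional resolution against noise suppression.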