Hypermotifs: Novel Discriminatory Patterns for Nucleotide Sequences and their Application to Core Promoter Prediction in Eukaryotes
C. Pridgeon, D. Corne
2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology
DOI: 10.1109/CIBCB.2005.1594949
We approach the general problem of finding a model that discriminates between classes of nucleotide sequences. A common approach in this area is to train a model, such as a neural network or a hidden Markov model, to perform the discrimination, using as inputs either the raw sequences encoded in a standard form or features derived from the raw data in a pre-processing stage. In this paper we introduce a novel discriminatory pattern structure for nucleotide sequences, called a hypermotif, and use evolutionary computation to evolve a collection of specific hypermotifs that discriminate between classes in the data. The raw nucleotide data are then transformed into feature vectors, where the features are the individual scores on the evolved hypermotifs. After this transformation, any classification method may be used to build a predictive model. We test the approach on a database of eukaryotic promoters and find that it outperforms a standard multilayer perceptron (despite using a linear discriminant as the final classifier) and performs similarly to the best approach reported so far for these data (which uses a time delay neural network).
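The transformation from raw sequence to feature vector can be illustrated with a minimal sketch. The abstract does not define the internal structure of a hypermotif or its scoring rule, so the `Pattern` type below (a list of position offsets with allowed nucleotides) and the `score` function are hypothetical stand-ins chosen only for illustration, not the authors' definition. The sketch shows just the general idea: each pattern yields one score per sequence, and the scores together form the feature vector.

```python
from typing import List, Set, Tuple

# ASSUMPTION: a stand-in pattern, NOT the paper's hypermotif definition.
# Each constraint is (offset from window start, set of allowed nucleotides).
Pattern = List[Tuple[int, Set[str]]]


def score(seq: str, pattern: Pattern) -> float:
    """Best fraction of constraints satisfied over all window start positions.

    Returns 0.0 when the sequence is shorter than the pattern's span.
    """
    span = max(offset for offset, _ in pattern) + 1
    best = 0.0
    for start in range(len(seq) - span + 1):
        hits = sum(seq[start + offset] in allowed for offset, allowed in pattern)
        best = max(best, hits / len(pattern))
    return best


def to_features(seq: str, patterns: List[Pattern]) -> List[float]:
    """Map one sequence to a feature vector: one score per pattern."""
    return [score(seq, p) for p in patterns]


# Two hand-written patterns; in the paper these would be evolved, not written by hand.
patterns: List[Pattern] = [
    [(0, {"T"}), (1, {"A"}), (2, {"T"}), (3, {"A"})],  # contiguous TATA-like pattern
    [(0, {"G", "C"}), (2, {"G", "C"})],                # gapped G/C pattern
]

features = to_features("GGTATAAC", patterns)
print(features)
```

Vectors produced this way have a fixed length regardless of sequence length, which is what allows any standard classifier, such as a linear discriminant, to be trained on them.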