Hypermotifs: Novel Discriminatory Patterns for Nucleotide Sequences and their Application to Core Promoter Prediction in Eukaryotes

C. Pridgeon, D. Corne
2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology
DOI: 10.1109/CIBCB.2005.1594949 (https://doi.org/10.1109/CIBCB.2005.1594949)
Citations: 3

Abstract

We address the general problem of finding a model that discriminates between classes of nucleotide sequences. A common approach in this area is to train a model, such as a neural network or a hidden Markov model, to perform the discrimination, using as inputs either the raw sequences encoded in a standard form or features derived from the raw data in a pre-processing stage. In this paper we introduce a novel discriminatory pattern structure for nucleotide sequences, called a hypermotif, and use evolutionary computation to evolve a collection of specific hypermotifs that discriminate between classes in the data. The raw nucleotide data are then transformed into feature vectors, where the features are the individual scores on the evolved hypermotifs. Using this transformation, any classification method may then be used to build an accurate predictive model. We test the approach on a database of eukaryotic promoters and find that it outperforms a standard multilayer perceptron (despite using a linear discriminant as the final classifier) and performs similarly to the best approach reported so far for these data (which uses a time delay neural network).
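The pipeline described in the abstract (score each sequence on a set of patterns, use the scores as a feature vector, then apply a linear classifier) can be sketched roughly as follows. The abstract does not define the hypermotif structure or the evolutionary search, so this sketch substitutes a hypothetical gapped pattern, a list of (offset, allowed nucleotides) pairs, written by hand rather than evolved. The linear rule is a simplified mean-difference discriminant, not necessarily the one used in the paper. All names, patterns, and sequences are illustrative assumptions.

```python
from typing import List, Set, Tuple

# ASSUMPTION: a hypothetical stand-in pattern, NOT the paper's hypermotif.
# Each element is (offset from the start position, allowed nucleotides).
Pattern = List[Tuple[int, Set[str]]]


def score(pattern: Pattern, seq: str) -> float:
    """Best fraction of pattern elements matched over all start positions."""
    span = max(off for off, _ in pattern) + 1
    best = 0.0
    for start in range(len(seq) - span + 1):
        hits = sum(seq[start + off] in allowed for off, allowed in pattern)
        best = max(best, hits / len(pattern))
    return best


def featurize(patterns: List[Pattern], seq: str) -> List[float]:
    """One feature per pattern: the sequence's score on that pattern."""
    return [score(p, seq) for p in patterns]


def fit_linear(X: List[List[float]], y: List[int]) -> Tuple[List[float], float]:
    """Simplified linear discriminant: project onto the difference of the
    class means and threshold at the midpoint between them."""
    pos = [x for x, label in zip(X, y) if label == 1]
    neg = [x for x, label in zip(X, y) if label == 0]
    dim = len(X[0])
    mu_pos = [sum(x[i] for x in pos) / len(pos) for i in range(dim)]
    mu_neg = [sum(x[i] for x in neg) / len(neg) for i in range(dim)]
    w = [a - b for a, b in zip(mu_pos, mu_neg)]
    bias = -sum(wi * (a + b) / 2 for wi, a, b in zip(w, mu_pos, mu_neg))
    return w, bias


def predict(w: List[float], bias: float, x: List[float]) -> int:
    """Return 1 (promoter-like) if the linear score is positive, else 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + bias > 0 else 0


# Hand-written toy patterns (in the paper these would be evolved).
patterns: List[Pattern] = [
    [(0, {"T"}), (1, {"A"}), (2, {"T"}), (3, {"A"})],        # TATA-like run
    [(0, {"G", "C"}), (2, {"G", "C"}), (4, {"G", "C"})],     # spaced G/C
]

# Made-up toy sequences; label 1 = "promoter-like", 0 = "other".
train_seqs = ["GGTATAAAGG", "CCTATATAGC", "GCGCGCGCGC", "ACGTGCGCAG"]
labels = [1, 1, 0, 0]

X = [featurize(patterns, s) for s in train_seqs]
w, bias = fit_linear(X, labels)
train_predictions = [predict(w, bias, x) for x in X]
new_prediction = predict(w, bias, featurize(patterns, "AATATAGGTT"))
```

Because the classifier only sees the score vectors, any other classification method could be swapped in for `fit_linear` and `predict` without changing the feature step, which is the flexibility the abstract points to.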