{"title":"An Effective Promoter Detection Method using the Adaboost Algorithm","authors":"Xudong Xie, Shuanhu Wu, K. Lam, Hong Yan","doi":"10.1142/9781860947995_0007","DOIUrl":null,"url":null,"abstract":"In this paper, an effective promoter detection algorithm, which is called PromoterExplorer, is proposed. In our approach, various features, i.e. local distribution of pentamers, positional CpG island features and digitized DNA sequence, are combined to build a high-dimensional input vector. A cascade AdaBoost based learning procedure is adopted to select the most “informative” or “discriminating” features to build a sequence of weak classifiers. A number of weak classifiers construct a strong classifier, which can achieve a better performance. In order to reduce the false positive, a cascade structure is used for detection. PromoterExplorer is tested based on large-scale DNA sequences from different databases, including EPD, Genbank and human chromosome 22. The proposed method consistently outperforms PromoterInspector and Dragon Promoter Finder.","PeriodicalId":74513,"journal":{"name":"Proceedings of the ... Asia-Pacific bioinformatics conference","volume":"64 1","pages":"37-46"},"PeriodicalIF":0.0000,"publicationDate":"2007-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ... Asia-Pacific bioinformatics conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1142/9781860947995_0007","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1
Abstract
In this paper, an effective promoter detection algorithm, which is called PromoterExplorer, is proposed. In our approach, various features, i.e. local distribution of pentamers, positional CpG island features and digitized DNA sequence, are combined to build a high-dimensional input vector. A cascade AdaBoost based learning procedure is adopted to select the most “informative” or “discriminating” features to build a sequence of weak classifiers. A number of weak classifiers construct a strong classifier, which can achieve a better performance. In order to reduce the false positive, a cascade structure is used for detection. PromoterExplorer is tested based on large-scale DNA sequences from different databases, including EPD, Genbank and human chromosome 22. The proposed method consistently outperforms PromoterInspector and Dragon Promoter Finder.