2010 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology: Latest Articles

Issues with the PipeAlign phylogenomics toolkit in identifying protein subfamilies
Christine Kehyayan, G. Butler
DOI: https://doi.org/10.1109/CIBCB.2010.5510344
Published: 2010-05-02
Abstract: Automated protein function annotation is extremely important in computational biology for its low cost. Standard sequence similarity comparison methods for annotation have limited specificity in identifying orthologs and paralogs. Phylogenomic methods are gaining popularity for their role in identifying orthologs and paralogs with the help of evolutionary information and sequence data. Pipelines have been developed for phylogenomic classification of proteins. Two such pipelines are PhyloFacts and PipeAlign. Given a protein of interest, these pipelines identify functional subfamilies for the protein superfamily. Subfamilies hold orthologs and paralogs and can later be used to identify orthologous groups. We evaluate the performance of PipeAlign with respect to both consistency in the generated subfamilies and phylogeny. We use the predefined subfamilies of PhyloFacts as a reference to compare the generated subfamilies of related reference sequences in PipeAlign. In the consistency analysis, we compare the compositions of the generated functional subfamilies with different related reference sequences, and use the predefined PhyloFacts subfamilies for the corresponding sequences as a measure of consistency. In the phylogenetic analysis, we compare the evolutionary distances of the members of the same and different generated subfamilies from PipeAlign.
Citations: 0
Side effect machines for quaternary edit metric decoding
J. A. Brown, S. Houghten, D. Ashlock
DOI: https://doi.org/10.1109/CIBCB.2010.5510422
Published: 2010-05-02
Abstract: DNA edit metric codes are used as labels to track the origin of sequence data. This study is the first to treat sophisticated decoders for these error-correcting codes. Side effect machines can provide efficient decoding algorithms for such codes. Two methods for automatically producing decoding algorithms are presented. Side Effect Machines (SEMs), generalizations of finite state automata, are used in both. Single Classifier Machines (SCMs) use a single side effect machine to classify all words within a code. Locking Side Effect Machines (LSEMs) use multiple side effect machines to create a tree-structured iterated classification. This study examines these techniques and provides new decoders for existing codes. Ideas for best practises in creating these two types of new edit metric decoders are presented. Codes of the form (n, M, d)₄ are used in testing due to their suitability for bioinformatics problems. A group of (12, 54–56, 7)₄ codes is used as an example of the process.
Citations: 11
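For orientation, decoding an edit metric code means mapping a received word to the codeword nearest to it in edit distance. The sketch below is a brute-force baseline only, not the side effect machine decoders described in the abstract. The function names and the toy codebook are illustrative assumptions and do not come from the paper.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard two-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def decode(word: str, codebook: list[str]) -> str:
    """Return the codeword closest to `word`; the first one wins on ties."""
    return min(codebook, key=lambda c: edit_distance(word, c))


# Hypothetical toy codebook over {A, C, G, T}; not one of the paper's codes.
TOY_CODEBOOK = ["AAAAAA", "CCCCCC", "GGGGGG", "TTTTTT"]
```

This search costs one distance computation per codeword, which is the kind of expense a learned decoder would aim to avoid.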
Additive noise analysis on microarray data via SVM classification
Z. Ding, Yanqing Zhang
DOI: https://doi.org/10.1109/CIBCB.2010.5510725
Published: 2010-05-02
Abstract: Microarray technology has been broadly used for monitoring the expression levels of thousands of genes simultaneously, providing the opportunity to identify disease-related genes by finding genes that are differentially expressed under different conditions. However, a great challenge in analyzing microarray data is the significant noise introduced by different experimental settings, laboratory procedures, genetic heterogeneity among samples, environmental variations among patients, and so on. This paper attempts to analyze the influence of this noise on each gene by measuring changes in classification performance. We assume each gene in microarray data includes an independently distributed, unknown uniform noise. Thus, we add a compensating noise back to each gene and test whether the classification accuracy of a linear support vector machine (SVM) improves. If the accuracy does increase, then we believe such noise does exist and degrades the relation of this gene to the disease status. Through extensive experiments on several public microarray datasets, we found that such added noise can improve the classification accuracy for several genes and that the results are relatively consistent, indicating that our method can be used to analyze the noise pattern in microarray experiments and to discover potentially important gene markers.
Citations: 5
Improved prediction of transcription binding sites from chromatin modification data
Kengo Sato, Thomas Whitington, T. Bailey, P. Horton
DOI: https://doi.org/10.1109/CIBCB.2010.5510323
Published: 2010-05-02
Abstract: In this paper we apply machine learning to the task of predicting transcription factor binding sites by combining information on multiple forms of chromatin modification with the binding strength of the DNA site as predicted by a position weight matrix. We additionally explore the effect of incorporating auxiliary features such as the distance of the site to the nearest gene's transcription start site and the degree to which the site is conserved among related species. We approach the task as a classification problem, and show that both Naïve Bayes and Random Forests can provide substantial increases in the accuracy of predicted binding sites. Our results extend previous work which simply filtered candidate sites based on H3K4Me3 chromatin modification scores. In addition we apply feature selection to explore which forms of chromatin modification and which auxiliary features have predictive value for which transcription factors.
Citations: 0
Improved PCR design for mouse DNA by training finite state machines
S. Yadav, S. Corns
DOI: https://doi.org/10.1109/CIBCB.2010.5510701
Published: 2010-05-02
Abstract: This project presents an updated method for classification of polymerase chain reaction primers in mice using finite state classifiers. This is done to compensate for many lab-, organism-, and chemical-specific factors that are costly. Using finite state classifiers can help decrease the number of primers that fail to amplify correctly. For training these classifiers, five different evolutionary algorithms that use an incremental fitness reward are used. Variations in the number of generations and the values in the fitness reward are examined, and the resulting designs are presented. By controlling the fitness reward correctly, there is a potential to develop classifiers with a high likelihood of accepting only good primers. The proposed tool can act as a post-production add-on to the standard primer picking algorithm for gene expression detection in mice to compensate for local factors that may induce errors.
Citations: 8
Nearest neighbor training of side effect machines for sequence classification
D. Ashlock, Andrew McEachern
DOI: https://doi.org/10.1109/CIBCB.2010.5510426
Published: 2010-05-02
Abstract: Side effect machines operate by associating side effects with the states of a finite state machine. The use of side effect machines permits the researcher to leverage information stored in the state transition structure, making machines that might be identical as recognizers behave differently as classifiers. The side effect machines in this study associate a counter with each state so that the number of times each state is visited becomes a numerical feature associated with each state. The key to effective use of these numerical features is to locate side effect machines for which the count vectors are good feature sets. In this study side effect machines are selected with an evolutionary algorithm. The Rand index of nearest neighbor classification of the count vectors serves as the fitness function for selecting side effect machines. A parameter study is performed on simple synthetic data and then side effect machines are trained to classify two sets of biological sequences. The first set comprises two categories of HLA sequences from the human major histocompatibility complex. The second comprises positive and negative examples of human endogenous retroviral sequences taken from the human genome. The retroviral sequences are challenging but good results are obtained. The HLA data is classified with complete accuracy.
Citations: 7
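The state-counter idea in the abstract above can be illustrated with a minimal sketch: run a finite state machine over a sequence, count visits to each state, and use the count vector as the feature for nearest neighbor classification. The hand-written transition table, the convention of counting a state each time it is entered after reading a symbol, and all names below are assumptions for illustration. The evolutionary search and Rand index fitness described in the abstract are omitted.

```python
from typing import Dict, List, Tuple

# Hypothetical 3-state machine over {A, C, G, T}; in the study such machines
# are selected by an evolutionary algorithm rather than written by hand.
TRANSITIONS: Dict[Tuple[int, str], int] = {
    (0, "A"): 0, (0, "C"): 1, (0, "G"): 2, (0, "T"): 0,
    (1, "A"): 0, (1, "C"): 1, (1, "G"): 2, (1, "T"): 1,
    (2, "A"): 0, (2, "C"): 1, (2, "G"): 2, (2, "T"): 2,
}
NUM_STATES = 3


def count_vector(seq: str, start: int = 0) -> List[int]:
    """Run the machine over `seq`, counting each state entered after a symbol."""
    counts = [0] * NUM_STATES
    state = start
    for symbol in seq:
        state = TRANSITIONS[(state, symbol)]
        counts[state] += 1
    return counts


def nearest_neighbor_label(query: str, labeled: List[Tuple[str, str]]) -> str:
    """Label `query` with the label of the training sequence whose count vector is closest."""
    q = count_vector(query)

    def squared_distance(item: Tuple[str, str]) -> int:
        v = count_vector(item[0])
        return sum((a - b) ** 2 for a, b in zip(q, v))

    return min(labeled, key=squared_distance)[1]
```

Two machines that accept the same language can still produce different count vectors, which is why the choice of machine matters for classification.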
Applying neural networks to classify influenza virus antigenic types and hosts
P. Attaluri, Zhengxin Chen, G. Lu
DOI: https://doi.org/10.1109/CIBCB.2010.5510726
Published: 2010-05-02
Abstract: Influenza viruses continue to evolve rapidly and are responsible for seasonal epidemics and occasional, but catastrophic, pandemics. We recently demonstrated the use of decision tree and support vector machine methods in classifying pandemic swine flu viral strains with high accuracy. Here, we applied the technique of artificial neural networks for the prediction of important influenza virus antigenic types (H1, H3, and H5) and hosts (Human, Avian, and Swine), which fulfills a critical need for a computational system for influenza surveillance. A comprehensive experiment on different k-mers and different binary encoding types showed that classification based upon frequencies of k-mer nucleotide strings performed better than transformed binary data of nucleotides. It has been found for the first time that the accuracy of virus classification varies from host to host and from gene segment to gene segment. In particular, compared to avian and swine viruses, human influenza viruses can be classified with high accuracy, which indicates influenza virus strains might have become well adapted to their human host and hence less variation occurs in human viruses. In addition, the accuracy of host classification varies from genome segment to segment, achieving the highest values when using the HA and NA segments for human host classification. This research, along with our previous studies, shows machine learning techniques play an indispensable role in virus classification.
Citations: 17
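The k-mer frequency features mentioned in the abstract above can be sketched as follows. This is a minimal illustration, assuming relative frequencies over the 4^k possible nucleotide k-mers and skipping windows with ambiguity codes. The function name and these conventions are assumptions, and the neural network itself is not shown.

```python
from itertools import product
from typing import List


def kmer_frequencies(seq: str, k: int) -> List[float]:
    """Relative frequency of each of the 4**k k-mers over ACGT, in lexicographic order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0] * len(kmers)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k].upper()
        if window in index:  # skip windows containing ambiguity codes such as N
            counts[index[window]] += 1
            total += 1
    # An empty or fully ambiguous sequence yields an all-zero vector.
    return [c / total for c in counts] if total else [0.0] * len(kmers)
```

The resulting fixed-length vector (length 4, 16, 64, ... for k = 1, 2, 3, ...) could serve as input to any classifier, regardless of sequence length.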
Machine learning approaches for customized docking scores: Modeling of inhibition of Mycobacterium tuberculosis enoyl acyl carrier protein reductase
G. Fogel, Jonathan Tran, Stephen Johnson, David Hecht
DOI: https://doi.org/10.1109/CIBCB.2010.5510700
Published: 2010-05-02
Abstract: Machine learning algorithms were used for feature selection and model generation of customized docking score functions for known inhibitors of Mycobacterium tuberculosis enoyl acyl carrier protein reductase. The features included small molecule descriptors derived from MOE, Accord, and Molegro as well as in silico docking energies/scores from GOLD and Autodock. The resulting models can be used to identify key descriptors for enoyl acyl carrier protein reductase inhibition and are useful for high-throughput screening of novel drug compounds. This paper also evaluates and contrasts several strategies for model generation for quantitative structure-activity relationships.
Citations: 6
A fitness-independent evolvability measure for evolutionary developmental systems
Yaochu Jin, J. Trommler
DOI: https://doi.org/10.1109/CIBCB.2010.5510475
Published: 2010-05-02
Abstract: Evolvability refers to an organism's ability to create heritable new phenotypes that potentially facilitate the organism's survival and reproduction. In this paper, a general evolvability measure for a computational model of evolutionary development is proposed. The measure is able to quantify individuals' evolvability, including robustness and innovation, independent of the fitness function of the evolutionary system. Empirical studies are performed to check the evolvability of individuals in in silico evolution of oscillatory behavior using the proposed evolvability measure. Our preliminary results suggest that evolvability of the developmental system can evolve without an explicit selection pressure on evolvability, confirming findings revealed in other artificial evolutionary systems.
Citations: 12
Exploring structural modeling of proteins for kernel-based enzyme discrimination
Marco A. Alvarez, Changhui Yan
DOI: https://doi.org/10.1109/CIBCB.2010.5510588
Published: 2010-05-02
Abstract: Computational methods play an important role in investigating the relationships between protein structure and function. In this study, we evaluate different graph representations of protein structures for kernel-based protein function prediction. We use shortest path graph kernels and support vector machines to predict whether a protein is an enzyme or not. We present three different and straightforward strategies for modeling protein structures. Accuracy averages for 10-fold cross-validation range from 84.31% to 86.97% for different modeling strategies, outperforming state-of-the-art work.
Citations: 3
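As a rough illustration of what a shortest path graph kernel computes, the sketch below uses a deliberately simplified variant: graphs are undirected, unweighted, and unlabeled, and the kernel counts pairs of shortest paths (one from each graph) that have equal length. This simplification, the graph encoding as `(node_count, edge_list)`, and all names are assumptions. The paper's protein graph representations and node labels are not reproduced here.

```python
from collections import Counter
from typing import List, Tuple

Graph = Tuple[int, List[Tuple[int, int]]]  # (number of nodes, undirected edge list)
INF = float("inf")


def shortest_path_lengths(n: int, edges: List[Tuple[int, int]]) -> Counter:
    """Histogram of finite shortest-path lengths over unordered node pairs (Floyd-Warshall)."""
    dist = [[0 if i == j else INF for j in range(n)] for i in range(n)]
    for u, v in edges:
        dist[u][v] = dist[v][u] = 1
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return Counter(
        dist[i][j] for i in range(n) for j in range(i + 1, n) if dist[i][j] < INF
    )


def sp_kernel(g1: Graph, g2: Graph) -> int:
    """Count pairs of shortest paths, one from each graph, that have equal length."""
    h1 = shortest_path_lengths(*g1)
    h2 = shortest_path_lengths(*g2)
    return sum(h1[length] * h2[length] for length in h1)
```

Because the value is a dot product of path-length histograms, the resulting kernel matrix is positive semidefinite and could be passed to a support vector machine as a precomputed kernel.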