{"title":"Using a Bayesian Posterior Density in the Design of Perturbation Experiments for Network Reconstruction","authors":"A. Almudevar, P. Salzman","doi":"10.1109/CIBCB.2005.1594920","DOIUrl":"https://doi.org/10.1109/CIBCB.2005.1594920","url":null,"abstract":"Gene perturbation experiments are commonly used in the reconstruction of gene regulatory networks. Because such experiments are often difficult to perform, it is important to predict on a rational basis those experiments likely to result in the greatest resolution of model uncertainty. When a method for constructing Bayesian posterior densities on the space of network models is available, this provides a means with which to estimate the expected reduction in entropy that would result from a given perturbation experiment. We define an algorithm for selecting perturbation experiments based on this idea, and demonstrate it using a simulation study using a Bayesian network model.","PeriodicalId":330810,"journal":{"name":"2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114870761","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Efficient Approach to Detect a Protein Community from a Seed","authors":"D. Wu, Xiaohua Hu","doi":"10.1109/CIBCB.2005.1594909","DOIUrl":"https://doi.org/10.1109/CIBCB.2005.1594909","url":null,"abstract":"Community structure is a topological property common to many networks. We present in this paper an efficient and accurate approach to detecting a community in a protein-protein interaction network from a given seed protein. Our experimental results show strong structural and functional relationships among member proteins within each of the communities identified by our approach, as verified by MIPS complex catalogue database and annotations.","PeriodicalId":330810,"journal":{"name":"2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124891220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Heuristic Algorithm for Computing Reversal Distance with MultiGene Families via Binary Integer Programming","authors":"J. Suksawatchon, C. Lursinsap, M. Bodén","doi":"10.1109/CIBCB.2005.1594916","DOIUrl":"https://doi.org/10.1109/CIBCB.2005.1594916","url":null,"abstract":"Hannenhalli and Pevzner developed the first polynomial-time algorithm for the combinatorial problem of sorting of signed genomic data. Their algorithm finds the minimum number of reversals required to rearrange one genome into another when no gene duplication is present. In this paper, we show how to extend the Hannenhalli-Pevzner approach to genomes with multigene families. We propose a new heuristic algorithm to compute the reversal distance between two genomes with multigene families via the concept of binary integer programming without removing gene duplicates. The experimental results on simulated and real biological data demonstrate that the proposed algorithm is able to find the reversal distance accurately.","PeriodicalId":330810,"journal":{"name":"2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122967287","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Biological Sequence Prediction using General Fuzzy Automata","authors":"M. Doostfatemeh, S. C. Kremer","doi":"10.1109/CIBCB.2005.1594947","DOIUrl":"https://doi.org/10.1109/CIBCB.2005.1594947","url":null,"abstract":"This paper shows how the newly developed paradigm of General Fuzzy Automata (GFA) can be used as a biological sequence predictor. We consider the positional correlations of amino acids in a protein family as the basic criteria for prediction and classification of unknown sequences. It will be shown how the GFA formalism can be used as an efficient tool for classification of protein sequences. The results show that this approach predicts the membership of an unknown sequence in a protein family better than profile Hidden Markov Models (HMMs) which are now a popular and putative approach in biological sequence analysis.","PeriodicalId":330810,"journal":{"name":"2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123671146","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reverse Engineering Non-Linear Gene Regulatory Networks Based on the Bacteriophage λ cI Circuit","authors":"J. Supper, C. Spieth, A. Zell","doi":"10.1109/CIBCB.2005.1594936","DOIUrl":"https://doi.org/10.1109/CIBCB.2005.1594936","url":null,"abstract":"The ability to measure the transcriptional response of cells has drawn much attention to the underlying transcriptional networks. To untangle the network, numerous models with corresponding reverse engineering methods have been applied. In this work, we propose a non-linear model with adjustable degrees of complexity. The corresponding reverse engineering method uses a probabilistic scheme to reduce the reconstruction problem to subnetworks. Adequate models for gene regulatory networks must be anchored on sufficient biological knowledge. Here, the cI auto-inhibition circuit (cI circuit) is used to validate our reverse engineering method. Simulations of the cI circuit are used for the reconstruction, whereas a simplified cI circuit model assists the modeling phase. Several levels of complexity are evaluated, subsequently the reconstructed models show different properties. As a result, we reconstruct an abstract model, capturing the dynamic behavior of the cI circuit to a high degree.","PeriodicalId":330810,"journal":{"name":"2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology","volume":"113 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122551751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Pathway Analyst Automated Metabolic Pathway Prediction","authors":"L. Pireddu, B. Poulin, D. Szafron, P. Lu, D. Wishart","doi":"10.1109/CIBCB.2005.1594924","DOIUrl":"https://doi.org/10.1109/CIBCB.2005.1594924","url":null,"abstract":"Metabolic pathways are crucial to our understanding of biology. The speed at which new organisms are being sequenced is outstripping our ability to experimentally determine their metabolic pathway information. In recent years several initiatives have been successful in automating the annotations of individual proteins in these organisms, either experimentally or by prediction. However, to leverage the success of metabolic pathways we need to automate their identification in our rapidly growing list of sequenced organisms. We present a prototype system for predicting the catalysts of important reactions and for organizing the predicted catalysts and reactions into previously defined metabolic pathways. We compare a variety of predictors that incorporate sequence similarity (BLAST), hidden Markov models (HMM) and Support Vector Machines (SVM). We found that there is an advantage to using different predictors for different reactions. We validate our prototype on 10 metabolic pathways across 13 organisms for which we obtained a cross-validation precision of 71.5% and recall of 91.5% in predicting the catalyst proteins of all reactions.","PeriodicalId":330810,"journal":{"name":"2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129447228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hypermotifs: Novel Discriminatory Patterns for Nucleotide Sequences and their Application to Core Promoter Prediction in Eukaryotes","authors":"C. Pridgeon, D. Corne","doi":"10.1109/CIBCB.2005.1594949","DOIUrl":"https://doi.org/10.1109/CIBCB.2005.1594949","url":null,"abstract":"We approach the general problem of finding a model that discriminates between classes of nucleotide sequences. In this area, a common approach is to train a model, such as a neural network, or a hidden Markov model, to perform the discrimination, using as inputs either the raw sequences encoded in a standard form, or features derived from the raw data in a pre-processing stage. In this paper a novel discriminatory pattern structure for nucleotide sequences is introduced, called a hypermotif, and evolutionary computation is used to evolve a collection of specific hypermotifs which discriminate between classes in the data. The raw nucleotide data are then processed, transforming them into feature vectors, where the features are the individual scores on the evolved hypermotifs. Using this transformation, any classification method may then be used to build an accurate predictive model. The approach is tested on a database of eukaryotic promoters, and we find that this method enables us to outperform a standard multilayer perceptron (despite using a linear discriminant as the final classifier), and provides similar performance to the best approach so far for these data (which uses a time delay neural network).","PeriodicalId":330810,"journal":{"name":"2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129550114","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Hybrid CI-Based Knowledge Discovery System on Microarray Gene Expression Data","authors":"Yuchun Tang, Yuanchen He, Yanqing Zhang, Zhen Huang, Xiaohua Hu, Rajshekhar Sunderraman","doi":"10.1109/CIBCB.2005.1594894","DOIUrl":"https://doi.org/10.1109/CIBCB.2005.1594894","url":null,"abstract":"A hybrid Computational Intelligence-based Knowledge Discovery system is presented in this paper. The system works in three phases. In phase 1, many feature selection algorithms are utilized to select informative cancer-related genes from microarray expression data. Compared with other algorithms, our GSVM-RFE algorithm demonstrates superior performance on the microarray expression dataset for AML/ALL classification. Specifically, a compact “perfect” gene subset is reported. In phase 2, many intelligent computation models are implemented to extract useful knowledge about functions of selected genes to regulate the cancer being studied. Knowledge can ease further biomedical study because of reliable information sources, high prediction accuracy, and ease of interpretation. Currently, knowledge is represented in two formats, Web-based and Rule-based. As future work, we plan to implement knowledge fusion algorithms in phase 3 to synthesize and consolidate hybrid knowledge into a single knowledge base to provide effective and efficient decision support for cancer diagnosis and drug discovery.","PeriodicalId":330810,"journal":{"name":"2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121221362","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Depth Annotation of RNA Folds for Secondary Structure Motif Search","authors":"D. Ashlock, J. Schonfeld","doi":"10.1109/CIBCB.2005.1594896","DOIUrl":"https://doi.org/10.1109/CIBCB.2005.1594896","url":null,"abstract":"The biological activity of RNA depends on the way it folds into secondary structures. Presented here is a framework for exploratory motif searching in the space of RNA secondary structures. A collection of RNA sequences, suspected of having a particular biological activity, is fragmented into overlapping pieces of a uniform size. Each piece is folded and the details of the fold are used to annotate the primary structure. Distances between annotated structures are computed. The distance matrix for the structures is then projected into the Euclidean plane for visualization and detection of clusters. A motif is taken to be a cluster in the two dimensional space. An instance of the framework is implemented for testing on a data set containing examples of the Iron Response Element in the following manner. Folding is performed with the Mfold package. A depth-of-fold that records stems and loops onto the primary sequence is used to annotate the pieces of RNA. Dynamic programming is used to find distances between pieces of annotated primary sequence. An evolutionary algorithm is then used to find a one-to-one mapping of pieces of RNA to points in the plane that has acceptable distortion of the distances found with dynamic programming. This one-to-one mapping is a form of non-linear projection that optimizes for fidelity of projected distances to the distances derived from the Iron Response Element data set.","PeriodicalId":330810,"journal":{"name":"2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127147254","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Selected String Representation for Whole Genomes","authors":"Xiaomeng Wu, Guohui Lin","doi":"10.1109/CIBCB.2005.1594905","DOIUrl":"https://doi.org/10.1109/CIBCB.2005.1594905","url":null,"abstract":"The increase in the amount of available genomic data has made phylogenetic analysis possible at the whole genome scale. However, such a huge amount of data imposes computational challenges in both memory consumption and CPU usage. One novel proposal in this paper is to extract sequence patterns that are biologically meaningful. Using these patterns, whole genomes can be mapped into a significantly lower dimensional space and subsequent studies using these representations become computationally feasible. Experiments on two datasets of 64 vertebrate mitochondrial genomes and 99 prokaryote whole genomes demonstrate that the selected sequence patterns result in good quality evolutionary distances in terms of the final phylogeny.","PeriodicalId":330810,"journal":{"name":"2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127920770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}