{"title":"Modeling oncology gene pathways network with multiple genotypes and phenotypes via a copula method","authors":"Le Bao, Zhou Zhu, Jingjing Ye","doi":"10.1109/CIBCB.2009.4925734","DOIUrl":"https://doi.org/10.1109/CIBCB.2009.4925734","url":null,"abstract":"Identification of interactions between molecular features (e.g. mutation, gene expression change) and gross phenotypes in diseases and other biological processes is one of the important challenges in genomic research. Popular approaches such as GSEA are limited to hypothesis tests of bivariate association. However, a specific phenotype is often dependent upon multiple molecular features. It is thus worth considering all possible interactions jointly for a more precise and realistic representation of the cellular network. In this article, a semiparametric copula model is developed to jointly model genotypes, pathways and phenotypes to accomplish this object. A two-step procedure for reconstruction of the network is described. Simulation studies indicate that the method is effective and accurate for the network reconstruction. Application using NCI60 cancer cell line data identifies several subsets of molecular features that jointly perform as the predictors of clinical phenotypes. 
The copula model is expected to have a broad impact on biomedical research, ranging from cancer treatment to disease prevention.","PeriodicalId":162052,"journal":{"name":"2009 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115697472","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SARNA-ensemble-predict: The effect of different dissimilarity metrics on a novel ensemble-based RNA secondary structure prediction algorithm","authors":"Herbert H. Tsang, K. Wiese","doi":"10.1109/CIBCB.2009.4925701","DOIUrl":"https://doi.org/10.1109/CIBCB.2009.4925701","url":null,"abstract":"Recently, there is a resurgence of interest in the RNA secondary structure prediction problem due to the discovery of many new families of non-coding RNAs with a variety of functions. This paper describes and presents a novel algorithm for RNA secondary structure prediction based on an ensemble-based approach. An evaluation of the performance in terms of sensitivity and specificity is made. Experiments were performed on eleven structures from four RNA classes (RNaseP, Group I intron 16S rRNA, Group I intron 23S rRNA and 16S rRNA). Three RNA secondary structure similarity metrics (base pair distance, tree edit distance, and thermodynamic energy distance) and their effects on the clustering algorithm were explored. The significant contribution of this paper is in the examining of the various results from employing different dissimilarity metrics. Overall, the base pair distance dissimilarity metric shows better results with the other two distance metrics (tree edit distance and thermodynamic energy distance). 
The results presented in this paper demonstrate that SARNA-Ensemble-Predict can give comparable performance to a state-of-the-art algorithm Sfold in terms of sensitivity.","PeriodicalId":162052,"journal":{"name":"2009 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116371081","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting protein subcellular locations for Gram-negative bacteria using neural networks ensemble","authors":"Junwei Ma, Wenqi Liu, Hong Gu","doi":"10.1109/CIBCB.2009.4925716","DOIUrl":"https://doi.org/10.1109/CIBCB.2009.4925716","url":null,"abstract":"Many species of Gram-negative bacteria are pathogenic bacteria that can cause disease in a host organism. This pathogenic capability is usually associated with certain components in Gram-negative cells, so it is highly desirable to develop an effective method to predict the Gram-negative bacterial protein subcellular locations. Reflecting the wide applications of neural networks in this field, we design seven different training functions based on Elman networks, and use a genetic algorithm to select the proper networks for an ensemble. Experimental results show that the neural networks ensemble has a dominant advantage in performance.","PeriodicalId":162052,"journal":{"name":"2009 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology","volume":"86 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123945446","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Statistical comparison of color model-classifier pairs in hematoxylin and eosin stained histological images","authors":"Mutlu Mete, U. Topaloglu","doi":"10.1109/CIBCB.2009.4925740","DOIUrl":"https://doi.org/10.1109/CIBCB.2009.4925740","url":null,"abstract":"Color is the most critical information for assessing histological images. However, in the literature, there is no standard color space in which particular color points are represented for computer vision tasks. In this paper, we evaluated 11 color models with three different learning schemas for their performance in classifying tumor-related colors. The color models we studied are CIELAB, CIELUV, CIEXYZ, CMY, CMYK, HSL, HSV, Hunter-LAB, NRGB, RGB, and SCT. With 11 color models, prediction accuracies of three well-known classifiers, namely SVMs, C4.5, and Naïve Bayes, are statistically compared on a large dataset of 3494 Hematoxylin and Eosin (HE) stained histopathologic images. Surprisingly, experimental results show that, in contrast to general assumptions, there is no single model that is better than the others in every case. However, C4.5 outperformed the other two classifiers, obtaining an average F-measure of 0.9989. Of the 11 color models, we suggest the C4.5-SCT pair as the most accurate classification framework for tumor identification in HE stained histological images.","PeriodicalId":162052,"journal":{"name":"2009 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121840760","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using diffusion characters for the taxonomy of self-organizing social networks","authors":"D. Ashlock, Colin Lee","doi":"10.1109/CIBCB.2009.4925708","DOIUrl":"https://doi.org/10.1109/CIBCB.2009.4925708","url":null,"abstract":"This study evolves agents to play iterated prisoners dilemma with choice and refusal. The choice and refusal mechanism causes the agents to self-organize social networks. We then apply a novel technique for inducing a pseudometric on the space of networks using diffusion characters to analyze the resulting social networks, and create an exploratory taxonomy of the social networks. The taxonomy agrees well with features visible in rendered drawing of the networks as well as with similarities in the fitness trajectories of the populations that give rise to those networks.","PeriodicalId":162052,"journal":{"name":"2009 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128569662","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A system for recognition of biological patterns in toxins using computational intelligence","authors":"B. P. Carvalho, T. M. Mendes, Ricardo de Souza Ribeiro, Ricardo Fortuna, José Marcos Veneroso, M. Mudado","doi":"10.1109/CIBCB.2009.4925717","DOIUrl":"https://doi.org/10.1109/CIBCB.2009.4925717","url":null,"abstract":"This work presents an innovative way to find biological patterns in toxins in order to classify them according to their biological functions. Based on relevant biological information (a database), a system was developed that uses computational intelligence to discover novel patterns within the primary and secondary structures of a set of toxins. The discovered patterns make it possible to differentiate these toxins by their function: binding to specific channels for sodium, calcium or potassium ions. The classification rules are built using a given toxin database which is pre-processed according to the existence of a signal peptide or propeptide in the primary sequence, together with the predicted secondary structures and its physical and chemical characteristics and water affinity information. The best patterns obtained are combined to generate a final rule. All the experiments were performed using 802 toxin primary sequences labeled as channel functions obtained from two public databases, ATDB and Tox-Prot. After using the system to solve three different binary classification problems, each one for a specific ion channel, a committee is used to obtain the final classification label for each toxin. 
The committee achieved a classification accuracy of 80%, with correctness of 97%, 67% and 55% for sodium, potassium and calcium channels, respectively.","PeriodicalId":162052,"journal":{"name":"2009 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology","volume":"98 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127672242","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Detecting sequence and structure homology via an integrative kernel: A case-study in recognizing enzymes","authors":"Isye Arieshanti, M. Bodén, S. Maetschke, Fabian A. Buske","doi":"10.1109/CIBCB.2009.4925706","DOIUrl":"https://doi.org/10.1109/CIBCB.2009.4925706","url":null,"abstract":"Sequence and structure are complementary pieces of information that can be used to infer protein function. We study and compare sequence, structure and sequence-structure integrative kernels to recognize proteins with enzymatic function. Using a support-vector machine, we show that kernels that combine sequence and structure information typically perform better (AUC 0.73) at this task than kernels that exploit either type of information exclusively. We find that the feature space of structure kernels complements that of sequence kernels, making both sources of similarity more accessible to kernel methods","PeriodicalId":162052,"journal":{"name":"2009 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology","volume":"112 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121451848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Diagnostic character location within the cryptic skipper butterfly species complex with an evolutionary algorithm","authors":"D. Ashlock, T. V. Königslöw","doi":"10.1109/CIBCB.2009.4925713","DOIUrl":"https://doi.org/10.1109/CIBCB.2009.4925713","url":null,"abstract":"This study presents an evolutionary algorithm for locating DNA sequence characters that are diagnostic between closely related groups of species. The algorithm is developed using synthetic data and then tested on biological data from a species of butterfly recently discovered to be a cryptic complex of species. This technique proved to be successful in locating positions that are diagnostic of the cryptic neotropical skipper butterfly species within the cytochrome c oxidase subunit I (COI) DNA barcode data. The algorithm uses a novel subset representation to select positions within the DNA sequences. A crossover operator that takes pairs of subsets to pairs of subsets is designed. This crossover operator permits the use of a novel mutation operator that disrupts loci showing evidence of convergence, yielding better preservation of diversity in the evolving population of diagnostic character positions. A lexical (tie breaking) fitness function is used to smooth the fitness landscape. The problem of locating diagnostic positions in DNA sequences proved difficult without lexical fitness; with that innovation in place the problem is quite tractable. 
The evolutionary algorithm developed has the potential for broad application such as in conservation, customs enforcement, and forensics.","PeriodicalId":162052,"journal":{"name":"2009 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130350285","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Protein fold recognition with adaptive local hyperplane algorithm","authors":"V. Kecman, Tao Yang","doi":"10.1109/CIBCB.2009.4925710","DOIUrl":"https://doi.org/10.1109/CIBCB.2009.4925710","url":null,"abstract":"The protein fold recognition task is important for understanding the biological functions of proteins. The adaptive local hyperplane (ALH) algorithm has been shown to perform better than many other renowned classifiers, including support vector machines, K-nearest neighbor, linear discriminant analysis, K-local hyperplane distance nearest neighbor algorithms and decision trees, on a variety of data sets. In this paper, we apply the ALH algorithm to well-known data sets for the protein fold recognition task without sequence similarity from Ding and Dubchak (2001). The results obtained demonstrate that the ALH algorithm outperforms all seven other well-known and established benchmark classifiers applied to the same data sets.","PeriodicalId":162052,"journal":{"name":"2009 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115438594","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A multi-threaded DNA tag/anti-tag library generator for multi-core platforms","authors":"A. Ravindran, Daniel J. Burns","doi":"10.1109/CIBCB.2009.4925724","DOIUrl":"https://doi.org/10.1109/CIBCB.2009.4925724","url":null,"abstract":"This paper describes a new approach to the problem of generating DNA tag/anti-tag libraries for use in biological assay methods. This approach couples multi-threaded coding methods and a highly parallel multi-population genetic algorithm to leverage performance gains made possible by the multi-core CPUs increasingly prevalent in today's commodity computers. We also describe the results of experiments characterizing the performance of this approach, which can yield up to an 8X speedup on a workstation equipped with dual quad-core CPUs. We observe that the coding effort required to implement this approach using the C language and Pthreads parallel programming model is greatly reduced compared to previous methods using both the VHDL language and reconfigurable hardware (FPGAs), and compared to C with the MPI API run on a cluster of computers.","PeriodicalId":162052,"journal":{"name":"2009 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology","volume":"70 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-03-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128949564","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}