{"title":"An iterative loop matching approach to the prediction of RNA secondary structures with pseudoknots","authors":"Jianhua Ruan, G. Stormo, Weixiong Zhang","doi":"10.1109/CSB.2003.1227394","DOIUrl":"https://doi.org/10.1109/CSB.2003.1227394","url":null,"abstract":"In this paper we present a heuristic algorithm, iterative loop matching, for predicting RNA pseudoknots. The method can utilize either thermodynamic or comparative information or both, thus is able to predict for both aligned and individual sequences. Using 8-12 homologous sequences, the algorithm correctly identifies more than 90% of base-pairs for short sequences and 80% overall. It correctly predicts nearly all pseudoknots, while having very few false predictions. Comparisons show that our algorithm is more sensitive and more specific than existing methods. In addition, our algorithm is very efficient and can be applied to sequences up to several thousands of bases long.","PeriodicalId":147883,"journal":{"name":"Computational Systems Bioinformatics. CSB2003. Proceedings of the 2003 IEEE Bioinformatics Conference. CSB2003","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126799658","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Noise-attenuation in artificial genetic networks","authors":"Y. Morishita, K. Aihara","doi":"10.1109/CSB.2003.1227428","DOIUrl":"https://doi.org/10.1109/CSB.2003.1227428","url":null,"abstract":"Dynamics of gene expressions is quite noisy because of intrinsic noise originated from the smallness of the number of related molecules. Noise-attenuation and system-stabilization in artificial genetic networks are important problems for various applications in engineering and medical areas. In this study, we propose a plausible method to control fluctuation in artificial genetic networks. The main idea is an addition of the molecules designed to specifically bind to synthesized proteins with fast equilibrium. This fast interaction between those molecules and the proteins absorbs and compensates for the variation from the average. We demonstrate that, by this method, we can stabilize not only single gene expression, but also system dynamics with multistable states.","PeriodicalId":147883,"journal":{"name":"Computational Systems Bioinformatics. CSB2003. Proceedings of the 2003 IEEE Bioinformatics Conference. CSB2003","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123726619","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A new approach to gene prediction using the self-organizing map","authors":"Shaun Mahony, Terry J. Smith, J. McInerney, A. Golden","doi":"10.1109/CSB.2003.1227365","DOIUrl":"https://doi.org/10.1109/CSB.2003.1227365","url":null,"abstract":"In this poster we present a gene prediction approach based on the self-organizing map that has the ability to automatically identify all the major patterns of content variation within a genome. The genome may then be scanned for regions displaying the same properties as one of these automatically identified models. Even using a relatively simple coding measure (codon usage), this method can predict the location of protein-coding sequences with a reasonably high accuracy. We also show other advantages of the approach, such as the ability to indicate genes that contain frame-shifts. We believe that this method has the potential to become a useful addition to the genome annotation process.","PeriodicalId":147883,"journal":{"name":"Computational Systems Bioinformatics. CSB2003. Proceedings of the 2003 IEEE Bioinformatics Conference. CSB2003","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124623753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An evolutionary approach to finding schemas for 3-class protein secondary structure prediction","authors":"Hsiang Chi Huang","doi":"10.1109/CSB.2003.1227384","DOIUrl":"https://doi.org/10.1109/CSB.2003.1227384","url":null,"abstract":"A genetic algorithm has been applied to predict building schemas of protein secondary structure. This research uses protein secondary data generated by DSSP. Although the average Q3 of this research is not the highest score among previous researches, some fundamental and useful building schemas of protein secondary structure have been found. The results of this study would be a valuable reference for understanding the basic building patterns of protein secondary structures.","PeriodicalId":147883,"journal":{"name":"Computational Systems Bioinformatics. CSB2003. Proceedings of the 2003 IEEE Bioinformatics Conference. CSB2003","volume":"27 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123387403","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Haplotype pattern mining & classification for detecting disease associated site","authors":"T. Kido, Masanori Baba, Hirohito Matsumine, Yoko Higashi, Hirotaka Higuchi, Masaaki Muramatsu","doi":"10.1109/CSB.2003.1227369","DOIUrl":"https://doi.org/10.1109/CSB.2003.1227369","url":null,"abstract":"Finding the causative genes for common diseases using SNP (single nucleotide polymorphism) markers is now becoming a real challenge. Although traditional statistical SNP association tests exist, these tests could not explain the effects of SNP combinations or probable recombination histories from ancestral chromosomes. Haplotype analysis of disease associated site provides more powerful markers than individual SNP analysis, and can help identify probable causative mutations. In this paper, we introduce a new method for effective haplotype pattern mining to detect disease associated mutations. Using this procedure, we can discover some of the new disease associated SNPs, which can not be detected by traditional methods. We will introduce a powerful tool for implementing this procedure with some worked examples.","PeriodicalId":147883,"journal":{"name":"Computational Systems Bioinformatics. CSB2003. Proceedings of the 2003 IEEE Bioinformatics Conference. CSB2003","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129816432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Experimental studies of the universal chemical key (UCK) algorithm on the NCI database of chemical compounds","authors":"R. Grossman, Pavan Kasturi, D. Hamelberg, B. Liu","doi":"10.1109/CSB.2003.1227324","DOIUrl":"https://doi.org/10.1109/CSB.2003.1227324","url":null,"abstract":"We have developed an algorithm called the universal chemical key (UCK) algorithm that constructs a unique key for a molecular structure. The molecular structures are represented as undirected labeled graphs with the atoms representing the vertices of the graph and the bonds representing the edges. The algorithm was tested on 236,917 compounds obtained from the National Cancer Institute (NCI) database of chemical compounds. In this paper we present the algorithm, some examples and the experimental results on the NCI database. On the NCI database, the UCK algorithm provided distinct unique keys for chemicals with different molecular structures.","PeriodicalId":147883,"journal":{"name":"Computational Systems Bioinformatics. CSB2003. Proceedings of the 2003 IEEE Bioinformatics Conference. CSB2003","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127560296","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Finding higher order motifs under the levenshtein measure","authors":"E. Adebiyi, Tinuke Dipe","doi":"10.1109/CSB.2003.1227414","DOIUrl":"https://doi.org/10.1109/CSB.2003.1227414","url":null,"abstract":"We study the problem of finding higher order motifs under the levenshtein measure, otherwise known as the edit distance. In the problem set-up, we are given N sequences, each of average length n, over a finite alphabet Σ and thresholds D and q, we are to find composite motifs that contain motifs of length P (these motifs occur with almost D differences) in 1 ≤ q ≤ N distinct sequences. Two interesting but involved algorithms for finding higher order motifs under the edit distance was presented by Marsan and Sagot. Their second algorithm is much more complicated and its complexity is asymptotically not better. Their first algorithm runs in O(M · N^2 n^(1+α·p·pow(ε))) where p ≥ 2, α > 0, pow(ε) is a concave function that is less than 1, ε = D/P and M is the expected number of all monad motifs. We present an alternative algorithmic approach also for Edit distance based on the concept described. The resulting algorithm is simpler and runs in O(N^2 n^(1+p·pow(ε))) expected time.","PeriodicalId":147883,"journal":{"name":"Computational Systems Bioinformatics. CSB2003. Proceedings of the 2003 IEEE Bioinformatics Conference. CSB2003","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128375753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Algorithms for bounded-error correlation of high dimensional data in microarray experiments","authors":"Mehmet Koyutürk, A. Grama, W. Szpankowski","doi":"10.1109/CSB.2003.1227412","DOIUrl":"https://doi.org/10.1109/CSB.2003.1227412","url":null,"abstract":"The problem of clustering continuous valued data has been well studied in literature. Its application to microarray analysis relies on such algorithms as k-means, dimensionality reduction techniques, and graph-based approaches for building dendrograms of sample data. In contrast, similar problems for discrete-attributed data are relatively unexplored. An instance of analysis of discrete-attributed data arises in detecting co-regulated samples in microarrays. In this paper, we present an algorithm and a software framework, PROXIMUS, for error-bounded clustering of high-dimensional discrete attributed datasets in the context of extracting co-regulated samples from microarray data. We show that PROXIMUS delivers outstanding performance in extracting accurate patterns of gene-expression.","PeriodicalId":147883,"journal":{"name":"Computational Systems Bioinformatics. CSB2003. Proceedings of the 2003 IEEE Bioinformatics Conference. CSB2003","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127621251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Clustering time-varying gene expression profiles using scale-space signals","authors":"T. Syeda-Mahmood","doi":"10.1109/CSB.2003.1227303","DOIUrl":"https://doi.org/10.1109/CSB.2003.1227303","url":null,"abstract":"The functional state of an organism is determined largely by the pattern of expression of its genes. The analysis of gene expression data from gene chips has primarily revolved around clustering and classification of the data using machine learning techniques based on the intensity of expression alone with the time-varying pattern mostly ignored. In this paper, we present a pattern recognition-based approach to capturing similarity by finding salient changes in the time-varying expression patterns of genes. Such changes can give clues about important events, such as gene regulation by cell-cycle phases, or even signal the onset of a disease. Specifically, we observe that dissimilarity between time series is revealed by the sharp twists and bends produced in a higher-dimensional curve formed from the constituent signals. Scale-space analysis is used to detect the sharp twists and turns and their relative strength with respect to the component signals is estimated to form a shape similarity measure between time profiles. A clustering algorithm is presented to cluster gene profiles using the scale-space distance as a similarity metric. Multidimensional curves formed from time series within clusters are used as cluster prototypes or indexes to the gene expression database, and are used to retrieve the functionally similar genes to a query gene profile. Extensive comparison of clustering using scale-space distance in comparison to traditional Euclidean distance is presented on the yeast genome database.","PeriodicalId":147883,"journal":{"name":"Computational Systems Bioinformatics. CSB2003. Proceedings of the 2003 IEEE Bioinformatics Conference. CSB2003","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127813632","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Degenerate primer design via clustering","authors":"Xintao Wei, D. Kuhn, G. Narasimhan","doi":"10.1109/CSB.2003.1227306","DOIUrl":"https://doi.org/10.1109/CSB.2003.1227306","url":null,"abstract":"This paper describes a new strategy for designing degenerate primers for a given multiple alignment of amino acid sequences. Degenerate primers are useful for amplifying homologous genes. However, when a large collection of sequences is considered, no consensus region may exist in the multiple alignment, making it impossible to design a single pair of primers for the collection. In such cases, manual methods are used to find smaller groups from the input collection so that primers can be designed for individual groups. Our strategy proposes an automatic grouping of the input sequences by using clustering techniques. Conserved regions are then detected for each individual group. Conserved regions are scored using a blocksimilarity score, a novel alignment scoring scheme that is appropriate for this application. Degenerate primers are then designed by reverse translating the conserved amino acid sequences to the corresponding nucleotide sequences. Our program, DePiCt, was written in BioPerl and was tested on the Toll-Interleukin Receptor (TIR) and the nonTIR family of plant resistance genes. Existing programs for degenerate primer design were unable to find primers for these data sets.","PeriodicalId":147883,"journal":{"name":"Computational Systems Bioinformatics. CSB2003. Proceedings of the 2003 IEEE Bioinformatics Conference. CSB2003","volume":"53 ","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120877611","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}