Computational Systems Bioinformatics. CSB2003. Proceedings of the 2003 IEEE Bioinformatics Conference. CSB2003: Latest Papers

Text pattern visualization for analysis of biology full text and captions
Andrea Grimes, R. Futrelle
{"title":"Text pattern visualization for analysis of biology full text and captions","authors":"Andrea Grimes, R. Futrelle","doi":"10.1109/CSB.2003.1227434","DOIUrl":"https://doi.org/10.1109/CSB.2003.1227434","url":null,"abstract":"Large textbanks comprised of thousands of full-text biology papers are rapidly becoming available. We describe an approach to characterize all major language patterns in biology text in terms of frameworks. Frameworks are \"containers\" made up of common phrases surrounding specific informational items such as gene and protein names. A framework viewer has been developed that shows similar text frameworks aligned on the screen much as biosequence visualization tools do. Using the viewer, it is evident that frameworks have the power to find the types of structures needed to develop useful information retrieval systems. As a simple example, one framework was able to concisely select 45,000 nouns from a corpus of 5 million words without error. This work points the way to highly automated systems that will be able to extract and index information in biology textbanks. Work in progress includes extensions to characterize recursive structures in text, subsystems to retrieve figures in papers, and the discovery of semantic relations to aid concept-based retrieval.","PeriodicalId":147883,"journal":{"name":"Computational Systems Bioinformatics. CSB2003. Proceedings of the 2003 IEEE Bioinformatics Conference. CSB2003","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121399896","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
A new method for predicting RNA secondary structure
Hirotoshi Taira, Tomonori Izumitani, Eisaku Maeda, Takeshi Suzuki
{"title":"A new method for predicting RNA secondary structure","authors":"Hirotoshi Taira, Tomonori Izumitani, Eisaku Maeda, Takeshi Suzuki","doi":"10.1109/CSB.2003.1227395","DOIUrl":"https://doi.org/10.1109/CSB.2003.1227395","url":null,"abstract":"It has recently become clear that many RNAs are not translated into proteins; instead, they work as functional molecules. These RNAs are called \"noncoding RNAs.\" Predicting the secondary structure of these RNAs is important for understanding their functions. We focus on Nussinov's algorithm and the SCFG version of Nussinov's algorithm as useful techniques for predicting RNA secondary structures. We introduce a new scoring table and loop length restriction to improve these algorithms, and the improved algorithms provided better levels of performance than the originals.","PeriodicalId":147883,"journal":{"name":"Computational Systems Bioinformatics. CSB2003. Proceedings of the 2003 IEEE Bioinformatics Conference. CSB2003","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125668671","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
The sea urchin endomesoderm gene regulatory network, an encoded logic map for early development
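For readers unfamiliar with the baseline named in the abstract above, the following is a minimal sketch of the classic Nussinov base-pair maximization recurrence with a minimum loop length. It is not the authors' improved method: the pairing set (Watson-Crick plus G-U wobble), the unit score per pair, the `min_loop` parameter, and the function name `nussinov_max_pairs` are all assumptions made for illustration, standing in for the paper's scoring table and loop length restriction, which the abstract does not specify.

```python
def nussinov_max_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs in an RNA sequence.

    Assumption: every allowed pair scores 1; a real scoring table
    would replace the constant 1 below with a pair-specific score.
    """
    allowed = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best score for the subsequence seq[i..j] (inclusive)
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j is unpaired
            # case 2: j pairs with k, leaving at least min_loop bases between
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in allowed:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


if __name__ == "__main__":
    print(nussinov_max_pairs("GGGAAAUCC"))
```

The table is filled in order of increasing span, so every subproblem is solved before it is needed; the overall cost is cubic in the sequence length.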
E. Davidson
{"title":"The sea urchin endomesoderm gene regulatory network, an encoded logic map for early development","authors":"E. Davidson","doi":"10.1109/CSB.2003.1227292","DOIUrl":"https://doi.org/10.1109/CSB.2003.1227292","url":null,"abstract":"The regulatory program for animal development is \"wired\" into the cis-regulatory networks of the genome. At the DNA level these networks consist of the clusters of cis-regulatory transcription factor target sites (modules) that direct the spatial and temporal expression of each phase of activity of those genes; and the linkages amongst them. Here \"linkage\" refers to the relation between genes encoding transcription factors and the target sites of genes which they control, and between the genes encoding elements of signal systems and their ultimate target genes. We are engaged in an effort to define at the genomic sequence level the gene regulatory network (GRN) controlling endomesoderm specification in sea urchin embryos. The proposed GRN is based on the following information: spatial and temporal expression patterns of all genes included; constraints from experimental embryology; and a massive perturbation analysis in which expression of every relevant gene is perturbed and the effects on all other genes measured. The GRN consists of predicted inputs into the cis-regulatory modules controlling the endomesoderm expression of the genes involved. These predictions can be tested at the cis-regulatory level: here several such tests are presented. They show that the perturbation analysis is a surprisingly informative predictor of DNA sequence-level regulatory transactions.","PeriodicalId":147883,"journal":{"name":"Computational Systems Bioinformatics. CSB2003. Proceedings of the 2003 IEEE Bioinformatics Conference. CSB2003","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128144140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Reconstruction of ancient operons from complete microbial genome sequences
Yuhong Wang, J. Rose, Bi-Cheng Wang, Dawei Lin
{"title":"Reconstruction of ancient operons from complete microbial genome sequences","authors":"Yuhong Wang, J. Rose, Bi-Cheng Wang, Dawei Lin","doi":"10.1109/CSB.2003.1227383","DOIUrl":"https://doi.org/10.1109/CSB.2003.1227383","url":null,"abstract":"Completed genomes not only provide DNA sequence information, but also reveal the relative locations of genes. In this paper, we propose a new method for reconstruction of \"ancient operons\" by taking advantage of the evolutionary information in both orthologous genes and their locations in a genome. The basic assumption is that the closer two genes were in an ancient genome, the more likely they will stay close in the current genome. An assembly of nonrandom neighboring pairs of genes in current genomes should be able to reconstruct the gene groups that were together at a certain point of time during evolution. Given the fact that genes that are close neighbors are more likely functionally related, the gene groups generated by this assembly process are named \"ancient operons\". The assembly is only meaningful when enough nonrandom pairs can be found. This was made possible by over 100 microbial genomes available in recent years. For proof of concept, we chose 63 nonredundant complete microbial genomes from the RefSeq database (May 2003 release) at NCBI. In order to normalize the effect of protein sequence mutations and other changes due to evolution, we only consider assembly of COGs (clusters of orthologous groups) in these genomes. A total of 4901 COGs from the NCBI COG database are used. The assembly process is similar to the one that assembles DNA sequences into contigs. In our case, the neighbor COG pairs are used as basic assembly units. A target function is defined based on neighbor frequency of pair-wise link among all 4901 COGs after analysis for all 63 genomes. We used the random cost algorithm, a global optimization algorithm, to minimize the target function and assembled COGs into contigs. The significance of these contigs is then assessed by statistical methods. The results suggest that the assembled contigs are statistically and biologically significant. This method and the assembled ancient operons provide a new way for studying microbial genomes, their evolution and for annotating proteins of unknown functions.","PeriodicalId":147883,"journal":{"name":"Computational Systems Bioinformatics. CSB2003. Proceedings of the 2003 IEEE Bioinformatics Conference. CSB2003","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130334497","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Magebuilder: a schema translation tool for generating MAGE-ML from tabular microarray data
Bill Martin, R. Horton
{"title":"Magebuilder: a schema translation tool for generating MAGE-ML from tabular microarray data","authors":"Bill Martin, R. Horton","doi":"10.1109/CSB.2003.1227359","DOIUrl":"https://doi.org/10.1109/CSB.2003.1227359","url":null,"abstract":"A 'Magebuilder' object takes a set of 'Magemap' objects and a set of data streams as input, and produces a MAGEstk object representation, which is then serialized as MAGE-ML. A 'Magemap' object encapsulates the rules of how data records from an input stream relate to one MAGE object. Each input 'stream' is an anonymous subroutine that supplies records whose fields represent columns in the input table. The input tables can be delimited text files, database queries, or essentially any source that can be coerced into a set of records with fixed fields.","PeriodicalId":147883,"journal":{"name":"Computational Systems Bioinformatics. CSB2003. Proceedings of the 2003 IEEE Bioinformatics Conference. CSB2003","volume":"2012 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132058223","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Tomato expression database (TED) - an interactive management tool for tomato expression profiling data
Z. Fei, Xuemei Tang, R. Alba, P. Payton, J. Giovannoni
{"title":"Tomato expression database (TED) - an interactive management tool for tomato expression profiling data","authors":"Z. Fei, Xuemei Tang, R. Alba, P. Payton, J. Giovannoni","doi":"10.1109/CSB.2003.1227354","DOIUrl":"https://doi.org/10.1109/CSB.2003.1227354","url":null,"abstract":"Expression data for approximately 12000 ESTs over a time course of tomato fruit development was generated. In order to provide the research community access to our normalized microarray data as a tool to assess relative expression of genes of interest, we developed a publicly accessible online database - tomato expression database (TED: http://ted.bti.comell.edu). Through this database, we provide multiple approaches to pursue analysis of specific genes of interest and/or access the larger microarray data set to identify sets of genes that may behave in a pattern of interest to the user. A set of useful data mining and data visualization tools were developed and are under continuing expansion according to users' requirements. Developed initially as a data mining and analysis resource, TED also contains comprehensive annotation of each EST including homology derived from sequence similarity searches of GenBank and GO terms assigned manually according to putative functions.","PeriodicalId":147883,"journal":{"name":"Computational Systems Bioinformatics. CSB2003. Proceedings of the 2003 IEEE Bioinformatics Conference. CSB2003","volume":"96 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121369710","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Clustering binary fingerprint vectors with missing values for DNA array data analysis
A. Figueroa, J. Borneman, Tao Jiang
{"title":"Clustering binary fingerprint vectors with missing values for DNA array data analysis","authors":"A. Figueroa, J. Borneman, Tao Jiang","doi":"10.1109/CSB.2003.1227302","DOIUrl":"https://doi.org/10.1109/CSB.2003.1227302","url":null,"abstract":"Oligonucleotide fingerprinting is a powerful DNA array based method to characterize cDNA and ribosomal RNA gene (rDNA) libraries and has many applications including gene expression profiling and DNA clone classification. We are especially interested in the latter application. A key step in the method is the cluster analysis of fingerprint data obtained from DNA array hybridization experiments. Most of the existing approaches to clustering use (normalized) real intensity values and thus do not treat positive and negative hybridization signals equally (positive signals are much more emphasized). In this paper, we consider a discrete approach. Fingerprint data are first normalized and binarized using control DNA clones. Because there may exist unresolved (or missing) values in this binarization process, we formulate the clustering of (binary) oligonucleotide fingerprints as a combinatorial optimization problem that attempts to identify clusters and resolve the missing values in the fingerprints simultaneously. We study the computational complexity of this clustering problem and a natural parameterized version, and present an efficient greedy algorithm based on minimum clique partition on graphs. The algorithm takes advantage of some unique properties of the graphs considered here, which allow us to efficiently find the maximum cliques as well as some special maximal cliques. Our experimental results on simulated and real data demonstrate that the algorithm runs faster and performs better than some popular hierarchical and graph-based clustering methods. The results on real data from DNA clone classification also suggest that this discrete approach is more accurate than clustering methods based on real intensity values, in terms of separating clones that have different characteristics with respect to the given oligonucleotide probes.","PeriodicalId":147883,"journal":{"name":"Computational Systems Bioinformatics. CSB2003. Proceedings of the 2003 IEEE Bioinformatics Conference. CSB2003","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115847657","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 28
Anticlustal: multiple sequence alignment by antipole clustering and linear approximate 1-median computation
C. Pietro, A. Ferro, G. Pigola, A. Pulvirenti, M. Purrello, M. Ragusa, D. Shasha
{"title":"Anticlustal: multiple sequence alignment by antipole clustering and linear approximate 1-median computation","authors":"C. Pietro, A. Ferro, G. Pigola, A. Pulvirenti, M. Purrello, M. Ragusa, D. Shasha","doi":"10.1109/CSB.2003.1227333","DOIUrl":"https://doi.org/10.1109/CSB.2003.1227333","url":null,"abstract":"In this paper we present a new multiple sequence alignment (MSA) algorithm called AntiClustAl. The method makes use of the commonly used idea of aligning homologous sequences belonging to classes generated by some clustering algorithm, and then continuing the alignment process in a bottom-up way along a suitable tree structure. The final result is then read at the root of the tree. Multiple sequence alignment in each cluster makes use of the progressive alignment with the 1-median (center) of the cluster. The 1-median of a set S of sequences is the element of S which minimizes the average distance from any other sequence in S. Its exact computation requires quadratic time. The basic idea of our proposed algorithm is to make use of a simple and natural algorithmic technique based on randomised tournaments which has been successfully applied to large size search problems in general metric spaces. In particular a clustering algorithm called antipole tree and an approximate linear 1-median computation are used. Compared with Clustal W, a widely used MSA tool, our algorithm shows better running times with fully comparable alignment quality. A successful biological application showing high amino acid conservation during evolution of Xenopus laevis SOD2 is also cited.","PeriodicalId":147883,"journal":{"name":"Computational Systems Bioinformatics. CSB2003. Proceedings of the 2003 IEEE Bioinformatics Conference. CSB2003","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125113087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13
A method for tight clustering: with application to microarray
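The abstract above defines the 1-median of a set S as the element of S minimizing the average distance to the other elements, and notes that computing it exactly takes quadratic time. The sketch below illustrates only that exact quadratic baseline, not the paper's randomised-tournament approximation. The choice of Levenshtein edit distance and the names `levenshtein` and `exact_one_median` are assumptions for illustration; the abstract does not say which distance the authors use.

```python
def levenshtein(a, b):
    """Edit distance between two strings (assumed metric, for illustration)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def exact_one_median(seqs, dist=levenshtein):
    """Return the element of seqs with the smallest total distance to all others.

    Evaluates every pair, so it makes a quadratic number of distance calls.
    """
    best, best_total = None, None
    for s in seqs:
        total = sum(dist(s, t) for t in seqs)  # dist(s, s) contributes 0
        if best_total is None or total < best_total:
            best, best_total = s, total
    return best


if __name__ == "__main__":
    print(exact_one_median(["ACGT", "ACGA", "TCGA", "ACGG"]))
```

Minimizing the total distance is equivalent to minimizing the average, since the set size is fixed. Ties are broken in favor of the earliest element, which is an arbitrary choice here.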
G. Tseng, W. Wong
{"title":"A method for tight clustering: with application to microarray","authors":"G. Tseng, W. Wong","doi":"10.1109/CSB.2003.1227343","DOIUrl":"https://doi.org/10.1109/CSB.2003.1227343","url":null,"abstract":"In this paper we propose a method for clustering that produces tight and stable clusters without forcing all points into clusters. Many existing clustering algorithms have been applied to microarray data to search for gene clusters with similar expression patterns. However, none has provided a way to deal with an essential feature of array data: many genes are expressed sporadically and do not belong to any of the significant biological functions (clusters) of interest. In fact, most current algorithms aim to assign all genes into clusters. For many biological studies, however, we are mainly interested in the most informative, tight and stable clusters with sizes of, say, 20-60 genes for further investigation. Tight Clustering has been developed specifically to address this problem. The tightest and most stable clusters are identified in a sequential manner through an analysis of the tendency of genes to be grouped together under repeated resampling. We validated this method in the expression profiles of the Drosophila life cycle. The result is shown to better serve biological needs in microarray analysis.","PeriodicalId":147883,"journal":{"name":"Computational Systems Bioinformatics. CSB2003. Proceedings of the 2003 IEEE Bioinformatics Conference. CSB2003","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127192066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Alignment-free sequence comparison with vector quantization and hidden Markov models
T. Pham
{"title":"Alignment-free sequence comparison with vector quantization and hidden Markov models","authors":"T. Pham","doi":"10.1109/CSB.2003.1227399","DOIUrl":"https://doi.org/10.1109/CSB.2003.1227399","url":null,"abstract":"We introduce the concept of multiresolutions using vector quantization and hidden Markov models as a basis for alignment-free comparison of sequences. Different similarity measures can be discovered at different resolutions of the two sequences. The proposed approach provides a new aspect for studying the complexity of biological data and is effective for real-time processing.","PeriodicalId":147883,"journal":{"name":"Computational Systems Bioinformatics. CSB2003. Proceedings of the 2003 IEEE Bioinformatics Conference. CSB2003","volume":"142 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114544952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1