Proceedings. IEEE Computer Society Bioinformatics Conference最新文献

筛选
英文 中文
A personalized and automated dbSNP surveillance system. 个性化、自动化dbSNP监控系统。
Shuo Liu, Steve Lin, Mark Woon, Teri E Klein, Russ B Altman
{"title":"A personalized and automated dbSNP surveillance system.","authors":"Shuo Liu,&nbsp;Steve Lin,&nbsp;Mark Woon,&nbsp;Teri E Klein,&nbsp;Russ B Altman","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>The development of high throughput techniques and large-scale studies in the biological sciences has given rise to an explosive growth in both the volume and types of data available to researchers. A surveillance system that monitors data repositories and reports changes helps manage the data overload. We developed a dbSNP surveillance system (URL: http://www.pharmgkb.org/do/serve?id=tools.surveillance.dbsnp) that performs surveillance on the dbSNP database and alerts users to new information. The system is notable because it is personalized and fully automated. Each registered user has a list of genes to follow and receives notification of new entries concerning these genes. The system integrates data from dbSNP, LocusLink, PharmGKB, and Genbank to position SNPs on reference sequences and classify SNPs into categories such as synonymous and non-synonymous SNPs. The system uses data warehousing, object model-based data integration, object-oriented programming, and a platform-neutral data access mechanism.</p>","PeriodicalId":87204,"journal":{"name":"Proceedings. IEEE Computer Society Bioinformatics Conference","volume":"2 ","pages":"132-6"},"PeriodicalIF":0.0,"publicationDate":"2003-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"25834350","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
A probabilistic model for identifying protein names and their name boundaries. 识别蛋白质名称及其名称边界的概率模型。
Kazuhiro Seki, Javed Mostafa
{"title":"A probabilistic model for identifying protein names and their name boundaries.","authors":"Kazuhiro Seki,&nbsp;Javed Mostafa","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>This paper proposes a method for identifying protein names in biomedical texts with an emphasis on detecting protein name boundaries. We use a probabilistic model which exploits several surface clues characterizing protein names and incorporates word classes for generalization. In contrast to previously proposed methods, our approach does not rely on natural language processing tools such as part-of-speech taggers and syntactic parsers, so as to reduce processing overhead and the potential number of probabilistic parameters to be estimated. A notion of certainty is also proposed to improve precision for identification. We implemented a protein name identification system based on our proposed method, and evaluated the system on real-world biomedical texts in conjunction with the previous work. The results showed that overall our system performs comparably to the state-of-the-art protein name identification system and that higher performance is achieved for compound names. In addition, it is demonstrated that our system can further improve precision by restricting the system output to those names with high certainties.</p>","PeriodicalId":87204,"journal":{"name":"Proceedings. IEEE Computer Society Bioinformatics Conference","volume":"2 ","pages":"251-8"},"PeriodicalIF":0.0,"publicationDate":"2003-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"25834887","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Efficient constrained multiple sequence alignment with performance guarantee. 高效约束多序列对齐,保证性能。
Francis Y Chin, N L Ho, T W Lam, Prudence W Wong, M Y Chan
{"title":"Efficient constrained multiple sequence alignment with performance guarantee.","authors":"Francis Y Chin,&nbsp;N L Ho,&nbsp;T W Lam,&nbsp;Prudence W Wong,&nbsp;M Y Chan","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>The Constrained Multiple Sequence Alignment problem is to align a set of sequences subject to a given constrained sequence, which arises from some knowledge of the structure of the sequences. This paper presents new algorithms for this problem, which are more efficient in terms of time and space (memory) than the previous algorithms [14], and with a worst-case guarantee on the quality of the alignment. Saving the space requirement by a quadratic factor is particularly significant as the previous O(n(4))-space algorithm has limited application due to its huge memory requirement. Experiments on real data sets confirm that our new algorithms show improvements in both alignment quality and resource requirements.</p>","PeriodicalId":87204,"journal":{"name":"Proceedings. IEEE Computer Society Bioinformatics Conference","volume":"2 ","pages":"337-46"},"PeriodicalIF":0.0,"publicationDate":"2003-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"25834135","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Prokaryote phylogeny without sequence alignment: from avoidance signature to composition distance. 无序列比对的原核生物系统发育:从回避特征到组成距离。
Bailin Hao, Ji Qi
{"title":"Prokaryote phylogeny without sequence alignment: from avoidance signature to composition distance.","authors":"Bailin Hao,&nbsp;Ji Qi","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>A new and essentially simple method to reconstruct prokaryotic phylogenetic trees from their complete genome data without using sequence alignment is proposed. It is based on the appearance frequency of oligopeptides of a fixed length (up to K = 6) in their proteomes. This is a method without fine adjustment and choice of genes. It can incorporate the effect of lateral gene transfer to some extent and leads to results comparable with the bacteriologists' systematics as reflected in the latest 2001 edition of the Bergey's Manual of Systematic Bacteriology [1, 2]. A key point in our approach is subtraction of a random background by using a Markovian model of order K - 1 from the composition vectors to highlight the shaping role of natural selection.</p>","PeriodicalId":87204,"journal":{"name":"Proceedings. IEEE Computer Society Bioinformatics Conference","volume":"2 ","pages":"375-84"},"PeriodicalIF":0.0,"publicationDate":"2003-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"25834139","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Towards index-based similarity search for protein structure databases. 基于索引的蛋白质结构数据库相似性搜索研究。
Orhan Camoğlu, Tamer Kahveci, Ambuj K Singh
{"title":"Towards index-based similarity search for protein structure databases.","authors":"Orhan Camoğlu,&nbsp;Tamer Kahveci,&nbsp;Ambuj K Singh","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>We propose two methods for finding similarities in protein structure databases. Our techniques extract feature vectors on triplets of SSEs (Secondary Structure Elements) of proteins. These feature vectors are then indexed using a multidimensional index structure. Our first technique considers the problem of finding proteins similar to a given query protein in a protein dataset. This technique quickly finds promising proteins using the index structure. These proteins are then aligned to the query protein using a popular pairwise alignment tool such as VAST. We also develop a novel statistical model to estimate the goodness of a match using the SSEs. Our second technique considers the problem of joining two protein datasets to find an all-to-all similarity. Experimental results show that our techniques improve the pruning time of VAST 3 to 3.5 times while keeping the sensitivity similar.</p>","PeriodicalId":87204,"journal":{"name":"Proceedings. IEEE Computer Society Bioinformatics Conference","volume":"2 ","pages":"148-58"},"PeriodicalIF":0.0,"publicationDate":"2003-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"25834352","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Local similarity in RNA secondary structures. RNA二级结构的局部相似性。
Matthias Höchsmann, Thomas Töller, Robert Giegerich, Stefan Kurtz
{"title":"Local similarity in RNA secondary structures.","authors":"Matthias Höchsmann,&nbsp;Thomas Töller,&nbsp;Robert Giegerich,&nbsp;Stefan Kurtz","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>We present a systematic treatment of alignment distance and local similarity algorithms on trees and forests. We build upon the tree alignment algorithm for ordered trees given by Jiang et. al (1995) and extend it to calculate local forest alignments, which is essential for finding local similar regions in RNA secondary structures. The time complexity of our algorithm is O(|F(1)| |F(2) deg(F(1)) deg(F(2)) (deg(F(1)) + deg(F(2))) where |F(i)| is the number of nodes in forest F(i) and deg (F(i)) is the degree of F(i). We provide carefully engineered dynamic programming implementations using dense, two-dimensional tables which considerably reduces the space requirement. We suggest a new representation of RNA secondary structures as forests that allow reasonable scoring of edit operations on RNA secondary structures. The comparison of RNA secondary structures is facilitated by a new visualization technique for RNA secondary structure alignments. Finally, we show how potential regulatory motifs can be discovered solely by their structural preservation, and independent of their sequence conservation and position.</p>","PeriodicalId":87204,"journal":{"name":"Proceedings. IEEE Computer Society Bioinformatics Conference","volume":"2 ","pages":"159-68"},"PeriodicalIF":0.0,"publicationDate":"2003-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"25834353","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Clustering binary fingerprint vectors with missing values for DNA array data analysis. 基于缺失值的二值指纹向量聚类分析。
Andres Figueroa, James Borneman, Tao Jiang
{"title":"Clustering binary fingerprint vectors with missing values for DNA array data analysis.","authors":"Andres Figueroa,&nbsp;James Borneman,&nbsp;Tao Jiang","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Oligonucleotide fingerprinting is a powerful DNA array based method to characterize cDNA and ribosomal RNA gene (rDNA) libraries and has many applications including gene expression profiling and DNA clone classification. We are especially interested in the latter application. A key step in the method is the cluster analysis of fingerprint data obtained from DNA array hybridization experiments. Most of the existing approaches to clustering use (normalized) real intensity values and thus do not treat positive and negative hybridization signals equally (positive signals are much more emphasized). In this paper, we consider a discrete approach. Fingerprint data are first normalized and binarized using control DNA clones. Because there may exist unresolved (or missing) values in this binarization process, we formulate the clustering of (binary) oligonucleotide fingerprints as a combinatorial optimization problem that attempts to identify clusters and resolve the missing values in the fingerprints simultaneously. We study the computational complexity of this clustering problem and a natural parameterized version, and present an efficient greedy algorithm based on MINIMUM CLIQUE PARTITION on graphs. The algorithm takes advantage of some unique properties of the graphs considered here, which allow us to efficiently find the maximum cliques as well as some special maximal cliques. Our experimental results on simulated and real data demonstrate that the algorithm runs faster and performs better than some popular hierarchical and graph-based clustering methods. The results on real data from DNA clone classification also suggest that this discrete approach is more accurate than clustering methods based on real intensity values, in terms of separating clones that have different characteristics with respect to the given oligonucleotide probes.</p>","PeriodicalId":87204,"journal":{"name":"Proceedings. IEEE Computer Society Bioinformatics Conference","volume":"2 ","pages":"38-47"},"PeriodicalIF":0.0,"publicationDate":"2003-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"25834431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Fast and sensitive probe selection for DNA chips using jumps in matching statistics. 利用匹配统计跳变快速灵敏地选择DNA芯片探针。
Sven Rahmann
{"title":"Fast and sensitive probe selection for DNA chips using jumps in matching statistics.","authors":"Sven Rahmann","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>The design of large scale DNA microarrays is a challenging problem. So far, probe selection algorithms must trade the ability to cope with large scale problems for a loss of accuracy in the estimation of probe quality. We present an approach based on jumps in matching statistics that combines the best of both worlds. This article consists of two parts. The first part is theoretical. We introduce the notion of jumps in matching statistics between two strings and derive their properties. We estimate the frequency of jumps for random strings in a non-uniform Bernoulli model and present a new heuristic argument to find the center of the length distribution of the longest substring that two random strings have in common. The results are generalized to near-perfect matches with a small number of mismatches. In the second part, we use the concept of jumps to improve the accuracy of the longest common factor approach for probe selection by moving from a string-based to an energy-based specificity measure, while only slightly more than doubling the selection time.</p>","PeriodicalId":87204,"journal":{"name":"Proceedings. IEEE Computer Society Bioinformatics Conference","volume":"2 ","pages":"57-64"},"PeriodicalIF":0.0,"publicationDate":"2003-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"25834433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Degenerate primer design via clustering. 通过聚类简化引物设计。
Xintao Wei, David N Kuhn, Giri Narasimhan
{"title":"Degenerate primer design via clustering.","authors":"Xintao Wei,&nbsp;David N Kuhn,&nbsp;Giri Narasimhan","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>This paper describes a new strategy for designing degenerate primers for a given multiple alignment of amino acid sequences. Degenerate primers are useful for amplifying homologous genes. However, when a large collection of sequences is considered, no consensus region may exist in the multiple alignment, making it impossible to design a single pair of primers for the collection. In such cases, manual methods are used to find smaller groups from the input collection so that primers can be designed for individual groups. Our strategy proposes an automatic grouping of the input sequences by using clustering techniques. Conserved regions are then detected for each individual group. Conserved regions are scored using a BlockSimilarity score, a novel alignment scoring scheme that is appropriate for this application. Degenerate primers are then designed by reverse translating the conserved amino acid sequences to the corresponding nucleotide sequences. Our program, DePiCt, was written in BioPerl and was tested on the Toll-Interleukin Receptor (TIR)and the non-TIR family of plant resistance genes. Existing programs for degenerate primer design were unable to find primers for these data sets.</p>","PeriodicalId":87204,"journal":{"name":"Proceedings. IEEE Computer Society Bioinformatics Conference","volume":"2 ","pages":"75-83"},"PeriodicalIF":0.0,"publicationDate":"2003-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"25834435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Fast and accurate probe selection algorithm for large genomes. 大基因组快速准确的探针选择算法。
Wing-Kin Sung, Wah-Heng Lee
{"title":"Fast and accurate probe selection algorithm for large genomes.","authors":"Wing-Kin Sung,&nbsp;Wah-Heng Lee","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>The oligo microarray (DNA chip) technology in recent years has a significant impact on genomic study. Many fields such as gene discovery, drug discovery, toxicological research and disease diagnosis, will certainly benefit from its use. A microarray is an orderly arrangement of thousands of DNA fragments where each DNA fragment is a probe (or a fingerprint) of a gene/cDNA. It is important that each probe must uniquely associate with a particular gene/cDNA. Otherwise, the performance of the microarray will be affected. Existing algorithms usually select probes using the criteria of homogeneity, sensitivity, and specificity. Moreover, they improve efficiency employing some heuristics. Such approaches reduce the accuracy. Instead, we make use of some smart filtering techniques to avoid redundant computation while maintaining the accuracy. Based on the new algorithm, optimal short (20 bases) or long (50 or 70 bases) probes can be computed efficiently for large genomes.</p>","PeriodicalId":87204,"journal":{"name":"Proceedings. IEEE Computer Society Bioinformatics Conference","volume":"2 ","pages":"65-74"},"PeriodicalIF":0.0,"publicationDate":"2003-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"25834434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信