Proceedings. IEEE Computer Society Bioinformatics Conference: Latest Articles

Simultaneous classification and feature clustering using discriminant vector quantization with applications to microarray data analysis
Proceedings. IEEE Computer Society Bioinformatics Conference Pub Date : 2002-08-14 DOI: 10.1109/CSB.2002.1039347
Jia Li, H. Zha
Abstract: In many applications of supervised learning, automatic feature clustering is often desirable for a better understanding of the interaction among the various features as well as the interplay between the features and the class labels. In addition, for high dimensional data sets, feature clustering has the potential to improve classification accuracy and reduce computational complexity. In this paper, a method is developed for simultaneous classification and feature clustering by extending discriminant vector quantization (DVQ), a prototype classification method derived from the principle of minimum description length using source coding techniques. The method incorporates feature clustering with classification performed by fusing features in the same clusters. To illustrate its effectiveness, the method has been applied to microarray gene expression data for human lymphoma classification. It is demonstrated that incorporating feature clustering improves classification accuracy, and the clusters generated match well with biologically meaningful gene expression signature groups.
Pages: 246-255
Citations: 17
Selective tree growing: a deterministic constant-space linear-time algorithm for pattern discovery and for computing multiple sequence alignment
Proceedings. IEEE Computer Society Bioinformatics Conference Pub Date : 2002-08-14 DOI: 10.1109/CSB.2002.1039367
Mashilamani Sambasivam
Abstract: Summary form only given. Given a set of n sequences, the multiple sequence alignment problem is to align these n sequences, with or without gaps, such that the commonality of the sequences is projected appropriately. If m is the total length of the input sequences, A is the alphabet size of the input sequences, and P is the final number of unique patterns, fixed by the user, that cause an alignment between sequences, then the algorithm runs in O(m(A + P)) time, which is linear in the worst case. Our algorithm handles sequences with both small and large alphabet sizes A. Our algorithm forms the alignment by first discovering patterns, and thus is also a pattern discovery solution. We support our theoretical conclusions with experimental results obtained from running our algorithm on GenPept sequences and human genome sequences from the GenBank public domain database. Our algorithm uses direct n-wise alignment and constant memory space irrespective of the value of m. What differentiates this algorithm from most others is that it is deterministic: it is guaranteed and theoretically proved that all patterns of any arbitrary length that occur in at least k sequences and that are responsible for multiple sequence alignment are found by the algorithm, where k is specified by the user.
Pages: 344-
Citations: 0
Constrained multiple sequence alignment tool development and its application to RNase family alignment
Proceedings. IEEE Computer Society Bioinformatics Conference Pub Date : 2002-08-14 DOI: 10.1109/CSB.2002.1039336
C. Tang, C. Lu, M. Chang, Yin-Te Tsai, Yuh-Ju Sun, K. Chao, Jia-Ming Chang, Yu-Han Chiou, Chia-Mao Wu, Hao-Teng Chang, Wei-I Chou
Abstract: In this paper, we design an algorithm for computing a constrained multiple sequence alignment (CMSA) that guarantees the generated alignment satisfies the user-specified constraints that some particular residues should be aligned together. If the number of residues that need to be aligned together is a constant α, then the time complexity of our CMSA algorithm for aligning K sequences is O(αKn^4), where n is the maximum of the lengths of the sequences. In addition, we have built such a CMSA software system and run several experiments on the RNase sequences, which mainly function in catalyzing the degradation of RNA molecules. The resulting alignments illustrate the practicability of our method.
Pages: 127-137
Citations: 57
A security system for personal genome information at DNA level
Proceedings. IEEE Computer Society Bioinformatics Conference Pub Date : 2002-08-14 DOI: 10.1109/CSB.2002.1039353
Y. Kawazoe, Toshikazu Shiba, Masahito Yamamoto, A. Ohuchi
Abstract: The personal information encoded in genomic DNA should not be made available to the public. With the increasing discoveries of new genes, it has become necessary to establish a security system for personal genome information. Although many security systems for electronic information in computers have been developed and established, there is no security system for information at the DNA level. We describe a new security system for information encoded within DNA. The original genomic DNA was mixed with many kinds of dummy DNAs (mixtures of natural and/or artificial DNAs), resulting in the masking of the original information. Using these dummy molecules, we succeeded in completely 'locking' the original genome information. If this information must be 'unlocked', it can be extracted and analyzed by removing the dummy DNAs using molecular tagging techniques or by selective amplification using key primers.
Pages: 314-320
Citations: 0
Wavelet profiles: their application in Oryza sativa DNA sequence analysis
Proceedings. IEEE Computer Society Bioinformatics Conference Pub Date : 2002-08-14 DOI: 10.1109/CSB.2002.1039368
N. Kawagashira, Yasuhiro Ohtomo, K. Murakami, K. Matsubara, J. Kawai, Piero Carninci, Y. Hayashizaki, S. Kikuchi
Abstract: Here we introduce our application of the wavelet analysis method to DNA sequences. In the signal processing field, the Fourier transform is popular for analyzing wave data. However, although this method can process frequency information, it fails to handle locational data. In contrast, the wavelet method accommodates both locational and frequency information for wave analysis, and it is becoming increasingly important for signal processing. The fast Fourier transform is already applied to biological sequence analysis using correlations. We introduce a new method, called the wavelet profile, for biological sequence analysis. Our method is based on the multiresolution analysis of the wavelet transform, which decomposes the data at several scales at the same time. We applied our wavelet profile method to identifying gene loci among O. sativa genomic sequences.
Pages: 345-346
Citations: 8
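The multiresolution idea mentioned in the wavelet profile abstract can be illustrated with a small sketch. The abstract does not say which wavelet or which numeric encoding of the sequence the authors use, so the Haar wavelet, the GC indicator signal, and the function names `gc_signal`, `haar_step`, and `haar_multiresolution` below are all assumptions chosen for illustration only.

```python
import math


def gc_signal(seq: str) -> list[float]:
    """Map a DNA string to a 0/1 signal marking G or C (an assumed encoding)."""
    return [1.0 if base in "GC" else 0.0 for base in seq.upper()]


def haar_step(signal: list[float]) -> tuple[list[float], list[float]]:
    """One level of the orthonormal Haar transform.

    Returns (approximation, detail). A trailing unpaired sample is dropped.
    """
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(signal) - 1, 2)
    approx = [(signal[i] + signal[i + 1]) * s for i in pairs]
    detail = [(signal[i] - signal[i + 1]) * s for i in pairs]
    return approx, detail


def haar_multiresolution(signal: list[float], levels: int):
    """Decompose a signal into detail coefficients at several scales.

    details[0] is the finest scale; the returned approximation is the
    coarsest remaining summary of the signal.
    """
    approx = list(signal)
    details = []
    for _ in range(levels):
        if len(approx) < 2:
            break
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


if __name__ == "__main__":
    coarse, details = haar_multiresolution(gc_signal("GGCCATATGCGCATTA"), 3)
    for level, coeffs in enumerate(details, start=1):
        print(f"scale {level}:", [round(c, 3) for c in coeffs])
    print("coarse:", [round(c, 3) for c in coarse])
```

Each entry of `details` keeps its position along the sequence, which is the property the abstract contrasts with the Fourier transform: large coefficients indicate where, and at what scale, the composition changes.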
Designing oscillators in synthetic gene networks based on multi-scale dynamics
Proceedings. IEEE Computer Society Bioinformatics Conference Pub Date : 2002-08-14 DOI: 10.1109/CSB.2002.1039359
Luonan Chen, Tetsuya J. Kobayashi, K. Aihara
Abstract: Multistability, oscillations, and switching exist at various levels of biological processes and organizations and have been investigated on the basis of many theoretical models, such as circadian oscillations with the period protein (PER) and the timeless protein (TIM) in Drosophila, and multistable dynamics regulated by transcription factors. Considerable experimental evidence suggests that cellular processes are intrinsically rhythmic or periodic. Various periodic oscillations with time scales ranging from less than a second to more than a year, which may allow living organisms to adapt their behaviors to a periodically varying environment, have also been observed experimentally. On the other hand, in synthetic gene networks, both the toggle switch and the repressilator have been theoretically proposed and further confirmed by experiments. All of these works stress the importance of feedback regulation of transcription factors, which is key to giving rise to the oscillatory or multistable dynamical behaviors exhibited by biological genetic systems. In addition, it should be noted that many periodic behaviors do not simply oscillate smoothly; rather, they change rapidly or jump at certain states. In gene expression systems, many different time scales characterize the gene regulatory processes. For instance, the transcription and translation processes generally evolve on a time scale that is much slower than that of phosphorylation, dimerization or binding reactions of transcription factors. In genetic networks, the time scale for expression of some genes is much slower than that of others, depending on the length of the genes.
We aim to design robust periodic oscillators in synthetic gene-protein systems using simple nonlinear models and to analyze the basic mechanism of limit cycles with jumping behaviors or relaxation oscillations by exploiting multiple time-scale properties [1, 2]. We show that periodic oscillations are mainly generated by nonlinear feedback loops in gene regulatory systems and that the jumping dynamics are caused by time scale differences among biochemical reactions. Moreover, effects of time delay are also examined. We show that time delay generally enlarges the stability region of oscillations, thereby making the oscillations more sustainable despite parameter changes or noise [1, 2]. The dynamics of the proposed models are robust in terms of stability and period length to parameter perturbations or environment variations. Although we mainly analyze some specific models, the mechanisms identified in this work are likely to apply to a variety of genetic regulatory systems. These simple models may act as basic building blocks in synthetic gene-protein networks, such as genetic oscillators or switches, because the dynamics are robust to parameter perturbations or environment variations. Several examples are also provided to demonstrate implementation of synthetic oscillators by using genes of the λ phage bact...
Pages: 336-
Citations: 0
Electronic polymerase chain reaction (EPCR) search algorithm
Proceedings. IEEE Computer Society Bioinformatics Conference Pub Date : 2002-08-14 DOI: 10.1109/CSB.2002.1039361
Conrad Shyu, J. Foster, L. Forney
Abstract: We developed an integer-encoding scheme and a search algorithm for in silico PCR (polymerase chain reaction) amplification that identifies sequence homology with the specified primers and enzymes. Unlike the traditional character-based approach, the EPCR algorithm represents DNA sequences as four integer variables. The bit streams in each integer variable reflect the occurrences of nucleotides (A, T, C, G) in the sequence. This approach exploits the fact that there are only four possible nucleotides in either DNA or RNA. A sequence of 32 nucleotides can therefore be reduced to four integers. In addition, since nucleotides are individually represented by four integer variables, ambiguities in the sequence (e.g., "N") can be fully resolved and encoded within the four integers.
Pages: 338-
Citations: 3
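The integer encoding described in the EPCR abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract only says that each of four integers records where one nucleotide occurs, so the bit order, the IUPAC ambiguity table, and the names `encode_block` and `matches` are hypothetical, and the matching rule shown is one plausible way such masks could be compared.

```python
# IUPAC codes mapped to the concrete bases they may stand for.
# U is treated like T so that RNA input also works (an assumption).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def encode_block(seq: str) -> dict[str, int]:
    """Encode up to 32 nucleotides as four bit masks, one per base.

    Bit i of masks[b] is set when position i of seq may be base b.
    An ambiguous symbol such as N sets the bit in several masks.
    """
    if len(seq) > 32:
        raise ValueError("a block holds at most 32 nucleotides")
    masks = {"A": 0, "C": 0, "G": 0, "T": 0}
    for i, symbol in enumerate(seq.upper()):
        for base in IUPAC[symbol]:
            masks[base] |= 1 << i
    return masks


def matches(primer: dict[str, int], target: dict[str, int], length: int) -> bool:
    """Return True when every position shares at least one possible base."""
    hit = 0
    for base in "ACGT":
        hit |= primer[base] & target[base]
    return hit == (1 << length) - 1


if __name__ == "__main__":
    primer = encode_block("ACGT")
    print(primer)
    print(matches(primer, encode_block("ANGT"), 4))
    print(matches(primer, encode_block("AGGT"), 4))
```

With this layout a comparison of 32 positions takes four AND operations and three ORs instead of a character-by-character loop, which appears to be the motivation for the integer representation.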
Automated identification of single nucleotide polymorphisms from sequencing data
Proceedings. IEEE Computer Society Bioinformatics Conference Pub Date : 2002-08-14 DOI: 10.1109/CSB.2002.1039332
Masazumi Takahashi, F. Matsuda, N. Margetic, M. Lathrop
Abstract: Single nucleotide polymorphisms (SNPs) provide abundant information about genetic variation. Large scale discovery of high frequency SNPs is being undertaken using various methods. However, the publicly available SNP data are not always accurate, and therefore should be verified. If only a particular gene locus is concerned, locus-specific polymerase chain reaction amplification may be useful. A problem with this method is that the secondary peak has to be measured. We have analyzed trace data from conventional sequencing equipment and found an applicable rule to discern SNPs from noise. We have developed software that integrates this function to automatically identify SNPs. The software works accurately for high quality sequences and can also detect SNPs in low quality sequences. Further, it can determine allele frequency, display this information as a bar graph and assign corresponding nucleotide combinations. It is very useful for identifying de novo SNPs in a DNA fragment of interest.
Pages: 87-93
Citations: 77
Protein-based analysis of alternative splicing in the human genome
Proceedings. IEEE Computer Society Bioinformatics Conference Pub Date : 2002-08-14 DOI: 10.1109/CSB.2002.1039335
A. Loraine, G. Helt, M. Cline, Michael A. Siani-Rose
Abstract: Understanding the functional significance of alternative splicing and other mechanisms that generate RNA transcript diversity is an important challenge facing modern-day molecular biology. Using homology-based, protein sequence analysis methods, it should be possible to investigate how transcript diversity impacts protein structure and function. To test this, a data mining technique ("DiffHit") was developed to identify and catalog genes producing protein isoforms which exhibit distinct profiles of conserved protein motifs. We found that out of a test set of over 1,300 alternatively spliced genes with solved genomic structure, over 30% exhibited a differential profile of conserved InterPro and/or Blocks protein motifs across distinct isoforms. These results suggest that motif databases such as Blocks and InterPro are potentially useful tools for investigating how alternative transcript structure affects gene function.
Pages: 118-124
Citations: 8
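The comparison at the heart of the DiffHit abstract, flagging genes whose isoforms carry different sets of conserved motifs, can be sketched in a few lines. The abstract gives no implementation details, so the input layout, the function name `differential_genes`, and the gene, isoform, and motif labels below are hypothetical placeholders rather than real InterPro or Blocks identifiers.

```python
def differential_genes(
    motifs_by_gene: dict[str, dict[str, set[str]]],
) -> dict[str, frozenset[str]]:
    """Find genes whose isoforms do not all share the same motif profile.

    motifs_by_gene maps gene -> isoform -> set of motif identifiers.
    The result maps each flagged gene to the motifs that are present in
    some isoforms but missing from others.
    """
    flagged = {}
    for gene, isoforms in motifs_by_gene.items():
        profiles = [frozenset(motifs) for motifs in isoforms.values()]
        if len(set(profiles)) > 1:
            shared = frozenset.intersection(*profiles)
            seen = frozenset.union(*profiles)
            flagged[gene] = seen - shared
    return flagged


if __name__ == "__main__":
    # Placeholder data: geneX loses motif_B in its second isoform.
    example = {
        "geneX": {"iso1": {"motif_A", "motif_B"}, "iso2": {"motif_A"}},
        "geneY": {"iso1": {"motif_C"}, "iso2": {"motif_C"}},
    }
    print(differential_genes(example))
```

Under this sketch, the fraction of genes returned over the total number of genes in the input would correspond to the "over 30%" figure the abstract reports, assuming motif hits have already been computed for each isoform.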
AxML: a fast program for sequential and parallel phylogenetic tree calculations based on the maximum likelihood method
Proceedings. IEEE Computer Society Bioinformatics Conference Pub Date : 2002-08-14 DOI: 10.1109/CSB.2002.1039325
A. Stamatakis, T. Ludwig, H. Meier, Marty J. Wolf
Abstract: Heuristics for the NP-complete problem of calculating the optimal phylogenetic tree for a set of aligned rRNA sequences based on the maximum likelihood method are computationally expensive. In most existing algorithms, the tree evaluation and branch length optimization functions, which calculate the likelihood value for each tree topology examined in the search space, account for the greatest part of the overall computation time. This paper introduces AxML, a program derived from fastDNAml, incorporating a fast topology evaluation function. The algorithmic optimizations introduced represent a general approach for accelerating this function and are applicable to both sequential and parallel phylogeny programs, irrespective of their search space strategy. Their integration into three existing phylogeny programs yielded encouraging results. Experimental results on conventional processor architectures show a global run time improvement of 35% to 47% for the various test sets and program versions we used.
Pages: 21-28
Citations: 41