Proceedings. IEEE Computational Systems Bioinformatics Conference: latest articles

Consensus genetic maps: a graph theoretic approach.
Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date: 2005-01-01 DOI: 10.1109/csb.2005.26
Benjamin N Jackson, Srinivas Aluru, Patrick S Schnable
Abstract: A genetic map is an ordering of genetic markers constructed from genetic linkage data for use in linkage studies and experimental design. While traditional methods have focused on constructing maps from a single population study, maps are increasingly generated for multiple lines and populations of the same organism. For example, in crop plants, where genetic variability is high, researchers have created maps for many populations. In the face of these new data, we address the increasingly important problem of generating a consensus map, an ordering of all markers in the various population studies. In our method, each input map is treated as a partial order on a set of markers. To find the most consistent order shared between maps, we model the partial orders as directed graphs. We create an aggregate by merging the transitive closures of the input graphs and taking the transitive reduction of the result. In this process, cycles may need to be broken to resolve inconsistencies between the inputs. The cycle-breaking problem is NP-hard, but the problem size depends on the scope of the inconsistency between the input graphs, which will be local if the input graphs are from closely related organisms. We present results of running the resulting software on maps generated from seven populations of the crop plant Zea mays.
Pages: 35-43
Citations: 20
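The merge step described in the abstract above (union of transitive closures, then transitive reduction) can be sketched roughly as follows. This is a minimal illustration, not the authors' software: it assumes each input map is a simple list of distinct marker names in order (the paper works with more general partial orders), the function names are hypothetical, and it only reports conflicting inputs rather than attempting the NP-hard cycle breaking.

```python
from itertools import combinations


def closure_edges(marker_order):
    """All (earlier, later) pairs implied by one linear map (its transitive closure)."""
    return set(combinations(marker_order, 2))


def transitive_reduction(nodes, edges):
    """Keep only edges (u, v) not implied by a two-step path u -> w -> v.

    Assumes `edges` is transitively closed and acyclic.
    """
    return {
        (u, v)
        for u, v in edges
        if not any((u, w) in edges and (w, v) in edges for w in nodes)
    }


def consensus_graph(maps):
    """Merge linear maps into one partial order; raise ValueError on a conflict."""
    merged = set()
    for order in maps:
        merged |= closure_edges(order)
    nodes = {marker for order in maps for marker in order}

    # Re-close the union, because edges from different maps can chain together.
    changed = True
    while changed:
        changed = False
        for u, w in list(merged):
            for w2, v in list(merged):
                if w == w2 and (u, v) not in merged:
                    merged.add((u, v))
                    changed = True

    # In a closed graph, any cycle shows up as a self-loop.
    if any(u == v for u, v in merged):
        raise ValueError("input maps are inconsistent (cycle found)")
    return transitive_reduction(nodes, merged)


if __name__ == "__main__":
    maps = [["a", "b", "d"], ["a", "c", "d"], ["b", "c"]]
    print(sorted(consensus_graph(maps)))
```

In this toy input the third map resolves the relative order of b and c, so the reduced graph is a single chain; without it, b and c would remain unordered in the consensus.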
Islands of tractability for parsimony haplotyping.
Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date: 2005-01-01 DOI: 10.1109/csb.2005.37
Roded Sharan, Bjarni V Halldórsson, Sorin Istrail
Abstract: We study the parsimony approach to haplotype inference, which calls for finding a set of haplotypes of minimum cardinality that explains an input set of genotypes. We prove that the problem is APX-hard even in very restricted cases. On the positive side, we identify islands of tractability for the problem by focusing on instances with a specific structure of haplotype sharing among the input genotypes. We exploit the structure of those instances to give polynomial and constant-approximation algorithms for the problem. We also show that the general parsimony haplotyping problem is fixed-parameter tractable.
Pages: 65-72
Citations: 0
Choosing SNPs using feature selection.
Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date: 2005-01-01 DOI: 10.1109/csb.2005.22
Tu Minh Phuong, Zhen Lin, Russ B Altman
Abstract: A major challenge for genome-wide disease association studies is the high cost of genotyping large numbers of single nucleotide polymorphisms (SNPs). The correlations between SNPs, however, make it possible to select a parsimonious set of informative SNPs, known as "tagging" SNPs, able to capture most variation in a population. Considerable research interest has recently focused on the development of methods for finding such SNPs. In this paper, we present an efficient method for finding tagging SNPs. The method does not involve a computation-intensive search for SNP subsets but discards redundant SNPs using a feature selection algorithm. In contrast to most existing methods, the method presented here does not limit itself to using only correlations between SNPs in local groups. By using correlations that occur across different chromosomal regions, the method can reduce the number of globally redundant SNPs. Experimental results show that the number of tagging SNPs selected by our method is smaller than the number selected by block-based methods.
Pages: 301-9
Citations: 89
Robust and accurate cancer classification with gene expression profiling.
Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date: 2005-01-01 DOI: 10.1109/csb.2005.49
Haifeng Li, Keshu Zhang, Tao Jiang
Abstract: Robust and accurate cancer classification is critical in cancer treatment. Gene expression profiling is expected to enable us to diagnose tumors precisely and systematically. However, the classification task in this context is very challenging because of the curse of dimensionality and the small sample size problem. In this paper, we propose a novel method to solve these two problems. Our method is able to map gene expression data into a very low dimensional space and thus meets the recommended ratio of samples to features per class. As a result, it can be used to classify new samples robustly with low and trustworthy (estimated) error rates. The method is based on linear discriminant analysis (LDA). However, conventional LDA requires that the within-class scatter matrix S_w be nonsingular. Unfortunately, S_w is always singular in the case of cancer classification due to the small sample size problem. To overcome this problem, we develop a generalized linear discriminant analysis (GLDA) that is a general, direct, and complete solution to optimize Fisher's criterion. GLDA is mathematically well-founded and coincides with conventional LDA when S_w is nonsingular. Unlike conventional LDA, GLDA does not assume the nonsingularity of S_w, and thus naturally solves the small sample size problem. To accommodate the high dimensionality of scatter matrices, a fast algorithm for GLDA is also developed. Our extensive experiments on seven public cancer datasets show that the method performs well. In particular, on some difficult instances that have very small ratios of samples to genes per class, our method achieves much higher accuracies than widely used classification methods such as support vector machines and random forests.
Pages: 310-21
Citations: 23
A pivoting algorithm for metabolic networks in the presence of thermodynamic constraints.
Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date: 2005-01-01 DOI: 10.1109/csb.2005.6
R Nigam, S Liang
Abstract: A linear programming algorithm is presented to constructively compute thermodynamically feasible fluxes and changes in the chemical potentials of reactions for a metabolic network. It is based on the physical laws of mass conservation and the second law of thermodynamics, which all chemical reactions should satisfy. As a demonstration, the algorithm has been applied to the core metabolic pathway of E. coli.
Pages: 259-67
Citations: 4
On optimizing distance-based similarity search for biological databases.
Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date: 2005-01-01 DOI: 10.1109/csb.2005.42
Rui Mao, Weijia Xu, Smriti Ramakrishnan, Glen Nuckolls, Daniel P Miranker
Abstract: Similarity search leveraging distance-based index structures is increasingly being used for both multimedia and biological database applications. We consider distance-based indexing for three important biological data types: protein k-mers with the metric PAM model, DNA k-mers with Hamming distance, and peptide fragmentation spectra with a pseudo-metric derived from cosine distance. To date, the primary driver of this research has been multimedia applications, where similarity functions are often Euclidean norms on high dimensional feature vectors. We develop results showing that the character of these biological workloads is different from that of multimedia workloads. In particular, they are not intrinsically very high dimensional and deserve different optimization heuristics. Based on MVP-trees, we develop a pivot selection heuristic that seeks centers and show that it outperforms the most widely used corner-seeking heuristic. Similarly, we develop a data partitioning approach that is sensitive to the actual data distribution, in lieu of median splits.
Pages: 351-61
Citations: 27
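To illustrate why pivots matter for distance-based search of the kind the abstract above discusses, here is a small sketch of range search over DNA k-mers under Hamming distance, using a single pivot and the triangle inequality to skip candidates. This is far simpler than the MVP-trees the paper builds on, it does not implement the authors' pivot selection heuristic, and all function names are illustrative assumptions.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("k-mers must have the same length")
    return sum(x != y for x, y in zip(a, b))


def build_index(kmers, pivot):
    """Precompute each k-mer's distance to the pivot."""
    return [(kmer, hamming(kmer, pivot)) for kmer in kmers]


def range_search(index, pivot, query, radius):
    """Return (matches, number of distance computations against the database)."""
    d_query_pivot = hamming(query, pivot)
    matches, computed = [], 0
    for kmer, d_pivot in index:
        # Triangle inequality: |d(q, p) - d(x, p)| <= d(q, x).
        # If this lower bound already exceeds the radius, x cannot match.
        if abs(d_query_pivot - d_pivot) > radius:
            continue
        computed += 1
        if hamming(query, kmer) <= radius:
            matches.append(kmer)
    return matches, computed


if __name__ == "__main__":
    kmers = ["AAAA", "AAAT", "TTTT", "TTTA", "ACGT"]
    index = build_index(kmers, pivot="AAAA")
    print(range_search(index, "AAAA", "AAAT", radius=1))
```

How many candidates the filter can skip depends on how the pivot sits relative to the data, which is the kind of choice a pivot selection heuristic is meant to improve.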
An algebraic geometry approach to protein structure determination from NMR data.
Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date: 2005-01-01 DOI: 10.1109/csb.2005.11
Lincong Wang, Ramgopal R Mettu, Bruce Randall Donald
Abstract: Our paper describes the first provably efficient algorithm for determining protein structures de novo, solely from experimental data. We show how the global nature of a certain kind of NMR data provides quantifiable complexity-theoretic benefits, allowing us to classify our algorithm as running in polynomial time. While our algorithm uses NMR data as input, it is the first polynomial-time algorithm to compute high-resolution structures de novo using any experimentally recorded data, from either NMR spectroscopy or X-ray crystallography. Improved algorithms for protein structure determination are needed because the process is currently expensive and time-consuming. For example, an area of intense research in NMR methodology is automated assignment of nuclear Overhauser effect (NOE) restraints, in which structure determination sits in a tight inner loop (cycle) of assignment and refinement. These algorithms are very time-consuming and typically require a large cluster. Thus, algorithms for protein structure determination that are known to run in polynomial time and provide guarantees on solution accuracy are likely to have great impact in the long term. Methods stemming from a technique called "distance geometry embedding" do come with provable guarantees, but the NP-hardness of these problem formulations implies that in the worst case these techniques cannot run in polynomial time. We are able to avoid the NP-hardness through (a) some mild assumptions about the protein being studied, (b) the use of residual dipolar couplings (RDCs) instead of a dense network of NOEs, and (c) novel algorithms and proofs that exploit the biophysical geometry of (a) and (b), drawing on a variety of computer science, computational geometry, and computational algebra techniques. In our algorithm, RDC data, which gives global restraints on the orientation of internuclear bond vectors, is used in conjunction with very sparse NOE data to obtain a polynomial-time algorithm for protein structure determination. An implementation of our algorithm has been applied to 6 different real biological NMR data sets recorded for 3 proteins. Our algorithm is combinatorially precise, runs in polynomial time, and uses much less NMR data to produce results that are as good as or better than those of previous approaches in terms of accuracy of the computed structure as well as running time. In practice, approaches such as restrained molecular dynamics and simulated annealing, which lack both combinatorial precision and guarantees on running time and solution quality, are commonly used. Our results show that by using a different "slice" of the data, an algorithm that runs in polynomial time and has guarantees about solution quality can be obtained. We believe that our techniques can be extended and generalized to other structure-determination problems such as computing side-chain conformations and the structure of nucleic acids from experimental data.
Pages: 235-46
Citations: 7
Multi-scale hierarchical structure prediction of helical transmembrane proteins.
Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date: 2005-01-01 DOI: 10.1109/csb.2005.41
Zhong Chen, Ying Xu
Abstract: As the first step toward a multi-scale, hierarchical computational approach for membrane protein structure prediction, the packing of transmembrane helices was modeled at the residue and atomistic levels. For predictions at the residue level, the helix-helix and helix-lipid interactions were described by a set of knowledge-based energy functions. For predictions at the atomistic level, the CHARMM19 force field was employed. To help the system overcome energy barriers, Wang-Landau sampling was carried out by performing a random walk in the energy and conformational spaces. Native-like structures were predicted at both levels for 2- and 7-helix systems. Interestingly, consistent results were obtained from simulations at the residue and atomistic levels for the same system, strongly suggesting the feasibility of a hierarchical approach for membrane structure prediction.
Pages: 203-7
Citations: 0
PSIST: indexing protein structures using suffix trees.
Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date: 2005-01-01 DOI: 10.1109/csb.2005.46
Feng Gao, Mohammed J Zaki
Abstract: Approaches for indexing proteins, and for fast and scalable searching for structures similar to a query structure, have important applications such as protein structure and function prediction, protein classification, and drug discovery. In this paper, we develop a new method for extracting the local feature vectors of protein structures. Each residue is represented by a triangle, and the correlation between a set of residues is described by the distances between Cα atoms and the angles between the normals of the planes in which the triangles lie. The normalized local feature vectors are indexed using a suffix tree. For all query segments, suffix trees can be used effectively to retrieve the maximal matches, which are then chained to obtain alignments with database proteins. Similar proteins are selected by their alignment score against the query. Our results show classification accuracy of up to 97.8% and 99.4% at the superfamily and class levels according to the SCOP classification, and show that on average 7.49 out of 10 proteins from the same superfamily are obtained among the top 10 matches. These results are competitive with the best previous methods.
Pages: 212-22
Citations: 29
TreeRefiner: a tool for refining a multiple alignment on a phylogenetic tree.
Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date: 2005-01-01 DOI: 10.1109/csb.2005.53
Aswath Manohar, Serafim Batzoglou
Abstract: We present TreeRefiner, a tool for refining multiple alignments of biological sequences. Given a multiple alignment, a phylogenetic tree, and scoring parameters as input, TreeRefiner optimizes the sum-of-pairs function in a restricted three-dimensional space around the alignment. At each internal node of the unrooted tree, the multiple alignment is projected to the sub-alignments corresponding to the three neighboring nodes, and three-dimensional dynamic programming is performed within a user-specified radius r around the original alignment. We test TreeRefiner on simulated sequences aligned by several popular tools, and demonstrate substantial improvements in the percentage of correctly aligned positions.
Pages: 111-9
Citations: 3
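For readers unfamiliar with the sum-of-pairs objective mentioned in the abstract above, the following sketch scores a multiple alignment column by column. It only evaluates the objective; it does not perform TreeRefiner's three-dimensional dynamic programming. The scoring values (match, mismatch, gap) and the convention that gap-gap pairs score zero are assumptions chosen for illustration, not the tool's actual parameters.

```python
from itertools import combinations


def column_score(column, match=1, mismatch=-1, gap=-2):
    """Sum the pairwise scores of all symbol pairs in one alignment column."""
    score = 0
    for x, y in combinations(column, 2):
        if x == "-" and y == "-":
            continue  # assumed convention: a gap aligned to a gap scores 0
        if x == "-" or y == "-":
            score += gap
        elif x == y:
            score += match
        else:
            score += mismatch
    return score


def sum_of_pairs(alignment, **params):
    """Sum-of-pairs score of an alignment given as equal-length gapped strings."""
    if len({len(row) for row in alignment}) > 1:
        raise ValueError("all aligned rows must have the same length")
    return sum(column_score(col, **params) for col in zip(*alignment))


if __name__ == "__main__":
    print(sum_of_pairs(["AC-T", "AGGT", "A-GT"]))
```

A refinement procedure would compare this score before and after a local change to the alignment and keep the change only if the score improves.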