Proceedings. IEEE Computational Systems Bioinformatics Conference: Latest Articles (Page 8)

High-throughput 3D structural homology detection via NMR resonance assignment.

Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date : 2004-01-01

Christopher James Langmead, Bruce Randall Donald

Abstract: One goal of the structural genomics initiative is the identification of new protein folds. Sequence-based structural homology prediction methods are an important means for prioritizing unknown proteins for structure determination. However, an important challenge remains: two highly dissimilar sequences can have similar folds. How can we detect this rapidly, in the context of structural genomics? High-throughput NMR experiments, coupled with novel algorithms for data analysis, can address this challenge. We report an automated procedure, called HD, for detecting 3D structural homologies from sparse, unassigned protein NMR data. Our method identifies 3D models in a protein structural database whose geometries best fit the unassigned experimental NMR data. HD does not use, and is thus not limited by, sequence homology. The method can also be used to confirm or refute structural predictions made by other techniques such as protein threading or homology modelling. The algorithm runs in O(pn + pn^(5/2) log(cn) + p log p) time, where p is the number of proteins in the database, n is the number of residues in the target protein, and c is the maximum edge weight in an integer-weighted bipartite graph. Our experiments on real NMR data from 3 different proteins against a database of 4,500 representative folds demonstrate that the method identifies closely related protein folds, including sub-domains of larger proteins, with as little as 10-30% sequence homology between the target protein (or sub-domain) and the computed model. In particular, we report no false negatives or false positives despite significant percentages of missing experimental data.

Pages: 278-89

Citations: 0
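The HD abstract above describes scoring each database model with an integer-weighted bipartite graph and then ordering the p models, which is where the p log p term comes from. The sketch below illustrates only that overall shape. It is not the authors' algorithm: the matching is done by brute force rather than in O(n^(5/2) log(cn)) time, and the meaning of the weights (agreement between one experimental peak and one model residue) and the names `best_matching_score` and `rank_models` are assumptions made for illustration.

```python
from itertools import permutations


def best_matching_score(weights):
    """Maximum total weight of a perfect matching in a square weight matrix.

    Brute force over all permutations: fine for a toy example,
    far slower than the matching algorithm the abstract's bound implies.
    """
    n = len(weights)
    return max(
        sum(weights[row][perm[row]] for row in range(n))
        for perm in permutations(range(n))
    )


def rank_models(models):
    """Score every candidate model, then sort best-first (the p log p step).

    `models` maps a hypothetical model name to its integer weight matrix.
    """
    scored = [(best_matching_score(w), name) for name, w in models.items()]
    return sorted(scored, reverse=True)
```

Calling `rank_models` on a dictionary of small matrices returns `(score, name)` pairs with the best-fitting model first.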

Improved Fourier transform method for unsupervised cell-cycle regulated gene prediction.

Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date : 2004-01-01 DOI: 10.1109/csb.2004.1332433

Karuturi R Murthy, Liu Jian Hua

Abstract: Motivation: Cell-cycle regulated gene prediction from microarray time-course measurements of mRNA expression levels has been used by several researchers. The most widely used approach is the Fourier transform (FT) method, applied in conjunction with a set of known cell-cycle regulated genes. In the absence of training data, the Fourier transform method is sensitive to noise, to an additive monotonic component arising from cell population growth, and to deviation from a strictly sinusoidal form of expression. Known cell-cycle regulated genes may not be available for certain organisms, and using them for training may bias the prediction. Results: In this paper we propose an Improved Fourier Transform (IFT) method which accounts for factors such as the monotonic additive component of cell-cycle expression and irregular or partial-cycle sampling of gene expression. The proposed algorithm does not need any known cell-cycle regulated genes for prediction. Besides removing the need for a training set, it also removes the bias toward genes similar to those in the training set. We have evaluated the method on two publicly available datasets: yeast cell-cycle data and HeLa cell-cycle data. On both datasets the proposed algorithm performed competitively with the supervised Fourier transform method. It outperformed other unsupervised methods such as Partial Least Squares (PLS) and Single Pulse Modeling (SPM). The method is easy to understand and implement, and runs faster.

Pages: 194-203

Citations: 0
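The abstract above names two problems for a plain Fourier score: a monotonic additive component and irregular sampling. The sketch below shows one simple way to handle both. It is not the authors' IFT method: it removes a least-squares straight line before scoring, and it evaluates the Fourier sum at the actual sample times instead of assuming an even grid. The function names and the assumption that the cell-cycle `period` is known in advance are illustrative only.

```python
import cmath
import math


def remove_linear_trend(times, values):
    """Subtract the least-squares straight line fitted to (times, values)."""
    n = len(values)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    var_t = sum((t - mean_t) ** 2 for t in times)
    cov_tv = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    slope = cov_tv / var_t
    return [v - mean_v - slope * (t - mean_t) for t, v in zip(times, values)]


def fourier_score(times, values, period):
    """Magnitude of the detrended signal's Fourier component at `period`.

    The sum uses the explicit sample times, so unevenly spaced
    time points are handled without resampling.
    """
    residual = remove_linear_trend(times, values)
    omega = 2 * math.pi / period
    coeff = sum(r * cmath.exp(-1j * omega * t) for t, r in zip(times, residual))
    return abs(coeff) / len(times)
```

A gene whose expression oscillates at the assumed period gets a clearly higher score than one that only drifts upward, which is the behaviour the abstract asks of an unsupervised score.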

Automated protein classification using consensus decision.

Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date : 2004-01-01 DOI: 10.1109/csb.2004.1332436

Tolga Can, Orhan Camoğlu, Ambuj K Singh, Yuan-Fang Wang

Abstract: We propose a novel technique for automatically generating the SCOP classification of a protein structure with high accuracy. High accuracy is achieved by combining the decisions of multiple methods using the consensus of a committee (or an ensemble) classifier. Our technique is rooted in machine learning, which shows that by judiciously employing component classifiers, an ensemble classifier can be constructed to outperform its components. We use two sequence-comparison and three structure-comparison tools as component classifiers. Given a protein structure, using the joint hypothesis, we first determine if the protein belongs to an existing category (family, superfamily, fold) in the SCOP hierarchy. For the proteins that are predicted as members of the existing categories, we compute their family-, superfamily-, and fold-level classifications using the consensus classifier. We show that we can significantly improve the classification accuracy compared to the individual component classifiers. In particular, we achieve error rates that are 3 to 12 times lower than the individual classifiers' error rates at the family level, 1.5 to 4.5 times lower at the superfamily level, and 1.1 to 2.4 times lower at the fold level.

Pages: 224-35

Citations: 0
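The consensus idea in the abstract above can be illustrated with a plain majority vote over the labels returned by the five component classifiers. This is a minimal sketch, not the authors' ensemble: the abstract does not say how the decisions are combined, so the vote threshold `min_votes`, the function name, and the use of `None` to mean "no existing category" are all assumptions.

```python
from collections import Counter


def consensus_label(predictions, min_votes):
    """Return the most common label if enough classifiers agree, else None.

    `predictions` holds one label per component classifier.
    None stands in for "does not belong to an existing category".
    """
    if not predictions:
        return None
    label, votes = Counter(predictions).most_common(1)[0]
    return label if votes >= min_votes else None
```

With five components and `min_votes=3`, a label is accepted only when a strict majority of the tools agree on it.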

Mapping of microbial pathways through constrained mapping of orthologous genes.

Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date : 2004-01-01 DOI: 10.1109/csb.2004.1332449

Victor Olman, Hanchuan Peng, Zhengchang Su, Ying Xu

Abstract: We present a novel computer algorithm for mapping biological pathways from one prokaryotic genome to another. The algorithm maps genes in a known pathway to their homologous genes (if any) in a target genome in the way that is most consistent with (a) predicted orthologous gene relationships, (b) predicted operon structures, and (c) predicted co-regulation relationships of operons. Mathematically, we have formulated this problem as a constrained minimum spanning tree problem (called a Steiner network problem), and demonstrated through applications that this formulation has the desired property. We have solved this mapping problem using a combinatorial optimization algorithm with guaranteed global optimality. We have implemented this algorithm as a computer program called P-MAP. Our test results on pathway mapping are highly encouraging: using P-MAP, we have mapped to E. coli a number of pathways of H. influenzae, B. subtilis, H. pylori, and M. tuberculosis whose homologous pathways in E. coli are known, so the mapping accuracy could be checked. We have then mapped known E. coli pathways in the EcoCyc database to the newly sequenced organism Synechococcus sp. WH8102, and predicted 158 Synechococcus pathways. Detailed analyses of the predicted pathways indicate that P-MAP's mapping results are consistent with our general knowledge about (local) pathways. We believe that P-MAP will be a useful tool for microbial genome annotation projects and for inference of individual microbial pathways.

Pages: 363-70

Citations: 0

A mixed factors model for dimension reduction and extraction of a group structure in gene expression data.

Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date : 2004-01-01 DOI: 10.1109/csb.2004.1332429

Ryo Yoshida, Tomoyuki Higuchi, Seiya Imoto

Citations: 0

Fractal genomics modeling: a new approach to genomic analysis and biomarker discovery.

Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date : 2004-01-01

Sandy Shaw, Paul Shapshak

Citations: 0

A self-tuning method for one-chip SNP identification.

Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date : 2004-01-01 DOI: 10.1109/csb.2004.1332419

Michael Molla, Jude Shavlik, Todd Richmond, Steven Smith

Citations: 0

Comparative analysis of gene sets in the Gene Ontology space under the multiple hypothesis testing framework.

Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date : 2004-01-01 DOI: 10.1109/csb.2004.1332455

Sheng Zhong, Lu Tian, Cheng Li, Kai-Florian Storch, Wing H Wong

Abstract: The Gene Ontology (GO) resource can be used as a powerful tool to uncover the properties shared among, and specific to, a list of genes produced by high-throughput functional genomics studies, such as microarray studies. In the comparative analysis of several gene lists, researchers may be interested in knowing which GO terms are enriched in one list of genes but relatively depleted in another. Statistical tests such as Fisher's exact test or the Chi-square test can be performed to search for such GO terms. However, because multiple GO terms are tested simultaneously, individual p-values from individual tests do not serve as good indicators for picking GO terms. Furthermore, because these multiple tests are highly correlated, the usual multiple testing procedures that work under an independence assumption are not applicable. In this paper we introduce a procedure, based on the False Discovery Rate (FDR), to treat this correlated multiple testing problem. This procedure calculates a moderately conservative estimator of the q-value for every GO term. We identify the GO terms with q-values that satisfy a desired level as the significant GO terms. This procedure has been implemented in the GoSurfer software. GoSurfer is a Windows-based graphical data mining tool. It is freely available at http://www.gosurfer.org.

Pages: 425-35

Citations: 0
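For context on the q-values mentioned in the abstract above, the sketch below computes them with the standard Benjamini-Hochberg step-up procedure. This is the kind of usual procedure the abstract contrasts itself with, not the correlation-aware estimator the paper introduces, and the function name `bh_qvalues` is an illustrative choice.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg q-values, returned in the input order.

    Each p-value is scaled by m / rank, then a running minimum from the
    largest p-value downward makes the q-values monotone in the p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        qvalues[index] = running_min
    return qvalues
```

GO terms whose q-value falls at or below a chosen level (for example 0.05) would then be reported as significant, with the caveat that this baseline does not account for correlation between tests.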

Segmental duplications containing tandem repeated genes encoding putative deubiquitinating enzymes.

Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date : 2004-01-01 DOI: 10.1109/csb.2004.1332414

Hong Liu, Li Li, Asher Zilberstein, Chang S Hahn

Abstract: Both inter- and intra-chromosomal segmental duplications are known to have occurred in the human genome during evolution. Few cases of such segments involving functional genes have been reported. While searching for the human orthologs of murine hematopoietic deubiquitinating enzymes (DUBs), we identified four clusters of DUB-like genes on chromosome 4p15 and chromosome 8p22-23 that are over 90% identical to each other at the DNA level. These genes are expressed in a cell type- and activation-specific manner, with different clusters possessing potentially distinct expression profiles. Examining the sequences surrounding these gene duplication events, we have identified previously unreported conserved sequence elements, as large as 35 to 74 kb, encircling the gene clusters. Traces of these elements are also found on chromosome 12p13 and chromosome 11q13. The coding and immediate upstream sequences for DUB-like genes, as well as the surrounding conserved elements, are present in the chimpanzee trace database but not in rodent genomes. We hypothesize that the segments containing these DUB clusters and surrounding elements arose relatively recently in evolution through inter- and intra-chromosomal duplicative transpositions, following the divergence of primates and rodents. A genome-wide systematic search for segmental duplications containing duplicated gene clusters has been performed.

Pages: 31-9

Citations: 0

Reasoning about molecular similarity and properties.

Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date : 2004-01-01 DOI: 10.1109/csb.2004.1332440

Rahul Singh

Abstract: Ascertaining the similarity amongst molecules is a fundamental problem in biology and drug discovery. Since similar molecules tend to have similar biological properties, the notion of molecular similarity plays an important role in exploration of molecular structural space, query-retrieval in molecular databases, and structure-activity modeling. This problem is related to the issue of molecular representation. Currently, approaches with high descriptive power, such as 3D surface-based representations, are available. However, most techniques tend to focus on 2D graph-based molecular similarity due to the complexity that accompanies reasoning with more elaborate representations. This paper addresses the problem of determining similarity when molecules are described using complex surface-based representations. It proposes an intrinsic, spherical representation that systematically maps points on a molecular surface to points on a standard coordinate system (a sphere). Molecular geometry, molecular fields, and effects due to field superposition can then be captured as distributions on the surface of the sphere. Molecular similarity is obtained by computing the similarity of the corresponding property distributions using a novel formulation of histogram intersection. This method is robust to noise, obviates molecular pose-optimization, can incorporate conformational variations, and facilitates highly efficient determination of similarity. Retrieval performance, applications in structure-activity modeling of complex biological properties, and comparisons with existing research and commercial methods demonstrate the validity and effectiveness of the approach.

Pages: 266-77

Citations: 0
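The abstract above compares property distributions by histogram intersection. The sketch below shows the classic form of that measure, normalized by the smaller histogram's total, applied to two already-binned distributions. It is not the paper's novel formulation, and the binning of surface properties on the sphere is assumed to have happened elsewhere; the function name is illustrative.

```python
def histogram_intersection(first, second):
    """Classic histogram intersection, scaled to the range 0 to 1.

    Sums the bin-wise minimum of two non-negative histograms and divides
    by the smaller total, so identical histograms score 1.0.
    """
    if len(first) != len(second):
        raise ValueError("histograms must have the same number of bins")
    smaller_total = min(sum(first), sum(second))
    if smaller_total == 0:
        return 0.0
    overlap = sum(min(a, b) for a, b in zip(first, second))
    return overlap / smaller_total
```

Because the score depends only on bin counts, no alignment of the two molecules is needed once both are binned on the same coordinate system, which matches the abstract's point about avoiding pose optimization.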