Proceedings. IEEE Computational Systems Bioinformatics Conference: Latest Articles

Peptide charge state determination for low-resolution tandem mass spectra.
Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date : 2005-01-01 DOI: 10.1109/csb.2005.44
Aaron A Klammer, Christine C Wu, Michael J MacCoss, William Stafford Noble
Abstract: Mass spectrometry is a particularly useful technology for the rapid and robust identification of peptides and proteins in complex mixtures. Peptide sequences can be identified by correlating their observed tandem mass spectra (MS/MS) with theoretical spectra of peptides from a sequence database. Unfortunately, to perform this search the charge of the peptide must be known, and current charge-state-determination algorithms only discriminate singly- from multiply-charged spectra: distinguishing +2 from +3, for example, is unreliable. Thus, search software is forced to search multiply-charged spectra multiple times. To minimize this inefficiency, we present a support vector machine (SVM) that quickly and reliably classifies multiply-charged spectra as having either a +2 or +3 precursor peptide ion. By classifying multiply-charged spectra, we obtain a 40% reduction in search time while maintaining an average of 99% of peptide and 99% of protein identifications originally obtained from these spectra.
Pages: 175-85
Citations: 36
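To see why an unknown charge forces repeated database searches, recall that one observed precursor m/z implies a different neutral peptide mass for each assumed charge. The sketch below is illustrative only and is not taken from the paper: the function names and the example m/z value are hypothetical, and the proton mass is a rounded constant.

```python
# Approximate mass of a proton in daltons (rounded; an assumption of this sketch).
PROTON_MASS = 1.007276


def neutral_mass(precursor_mz: float, charge: int) -> float:
    """Neutral peptide mass implied by a precursor m/z at an assumed charge.

    Uses m/z = (M + z * m_proton) / z, rearranged to M = z * (m/z - m_proton).
    """
    return charge * (precursor_mz - PROTON_MASS)


def candidate_masses(precursor_mz: float, charges=(2, 3)) -> dict:
    """Map each assumed charge to the neutral mass a search would have to use."""
    return {z: neutral_mass(precursor_mz, z) for z in charges}


if __name__ == "__main__":
    # Hypothetical precursor m/z; each charge assumption gives a different mass,
    # so without a charge call the spectrum must be searched once per candidate.
    for z, mass in candidate_masses(500.0).items():
        print(f"charge +{z}: neutral mass {mass:.4f} Da")
```

A classifier that settles on +2 or +3 before the search removes one of the two candidate masses, which is the saving in search time the abstract describes.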
An efficient algorithm for Perfect Phylogeny Haplotyping.
Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date : 2005-01-01 DOI: 10.1109/csb.2005.12
Ravi Vijayasatya, Amar Mukherjee
Abstract: The Perfect Phylogeny Haplotyping (PPH) problem is one of the many computational approaches to the Haplotype Inference (HI) problem. Though there are many O(nm^2) solutions to the PPH problem, the complexity of the PPH problem itself has remained an open question. In this paper, we introduce the FlexTree data structure, which represents all the solutions for a PPH instance. We also introduce row ordering, which arranges the genotypes in a more manageable fashion. The column ordering, the FlexTree data structure and the row ordering together make the O(nm) OPPH algorithm possible. We also present some results on simulated data which demonstrate that the OPPH algorithm performs quite impressively when compared to the earlier O(nm^2) algorithms.
Pages: 103-10
Citations: 7
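One way to make the perfect phylogeny condition concrete is the classical four-gamete test: a set of binary haplotypes fits an (unrooted) perfect phylogeny exactly when no pair of sites shows all four combinations 00, 01, 10 and 11. The sketch below checks that condition for haplotypes that are already resolved. It is not the OPPH algorithm or the FlexTree structure from the abstract, it runs in O(nm^2) time, and the function name and toy data are assumptions made for illustration.

```python
from itertools import combinations


def passes_four_gamete_test(haplotypes) -> bool:
    """Return True if no pair of sites exhibits all four gametes.

    `haplotypes` is a list of equal-length sequences of 0/1 values,
    one sequence per haplotype and one position per site.
    """
    if not haplotypes:
        return True
    n_sites = len(haplotypes[0])
    for i, j in combinations(range(n_sites), 2):
        # Collect the distinct (site i, site j) value pairs seen across haplotypes.
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            return False
    return True


if __name__ == "__main__":
    compatible = [[0, 0], [0, 1], [1, 0]]
    conflicting = compatible + [[1, 1]]
    print(passes_four_gamete_test(compatible))
    print(passes_four_gamete_test(conflicting))
```

In the PPH setting the input is genotypes rather than haplotypes, so a check like this would apply only to a candidate resolution of those genotypes.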
Gene teams with relaxed proximity constraint.
Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date : 2005-01-01 DOI: 10.1109/csb.2005.33
Sun Kim, Jeong-Hyeon Choi, Jiong Yang
Abstract: Functionally related genes co-evolve, probably due to strong selection pressure in evolution. Thus we expect them to be present in multiple genomes. Physical proximity among genes, known as a gene team, is a very useful concept for discovering functionally related genes in multiple genomes. However, there are also many gene sets that do not preserve physical proximity. In this paper, we generalize the gene team model, which looks for gene clusters in a physically clustered form, to multiple-genome cases with a relaxed constraint. We propose a novel hybrid pattern model that combines the set and the sequential pattern models. Our model searches for gene clusters with and/or without the physical proximity constraint. This model was implemented and tested with 97 genomes (120 replicons). The result was analyzed to show the usefulness of our model. In particular, analysis of gene clusters that belong to B. subtilis and E. coli demonstrated that our model predicted many experimentally verified operons and functionally related clusters. Our program is fast enough to provide a service on the web at http://platcom.informatics.indiana.edu/platcom/. Users can select any combination of the 97 genomes to predict gene teams.
Pages: 44-55
Citations: 28
Computational method for temporal pattern discovery in biomedical genomic databases.
Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date : 2005-01-01 DOI: 10.1109/csb.2005.25
Mohammed I Rafiq, Martin J O'Connor, Amar K Das
Abstract: With the rapid growth of biomedical research databases, opportunities for scientific inquiry have expanded quickly and led to a demand for computational methods that can extract biologically relevant patterns from vast amounts of data. A significant challenge is identifying temporal relationships among genotypic and clinical (phenotypic) data. Few software tools are available for such pattern matching, and they are not interoperable with existing databases. We are developing and validating a novel software method for temporal pattern discovery in biomedical genomics. In this paper, we present an efficient and flexible query algorithm (called TEMF) to extract statistical patterns from time-oriented relational databases. We show that TEMF, as an extension to our modular temporal querying application (Chronus II), can express a wide range of complex temporal aggregations without the need for data processing in a statistical software package. We show the expressivity of TEMF using example queries from the Stanford HIV Database.
Pages: 362-5
Citations: 11
Investigation into biomedical literature classification using support vector machines.
Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date : 2005-01-01 DOI: 10.1109/csb.2005.36
Nalini Polavarapu, Shamkant B Navathe, Ramprasad Ramnarayanan, Abrar ul Haque, Saurav Sahay, Ying Liu
Abstract: Searching for a specific topic in the PubMed database, one of the most important information resources for the scientific community, presents a big challenge to users. The researcher typically formulates Boolean queries and then scans the retrieved records for relevance, which is very time-consuming and error-prone. We applied Support Vector Machines (SVM) for automatic retrieval of PubMed articles related to human genome epidemiological research at the CDC (Centers for Disease Control and Prevention). In this paper, we discuss various investigations into biomedical literature classification and analyze the effect of various issues related to the choice of keywords, training sets, kernel functions and parameters for the SVM technique. We report on the factors above to show that SVM is a viable technique for automatic classification of biomedical literature into topics of interest such as epidemiology, cancer, birth defects, etc. In all our experiments, we achieved high values of PPV, sensitivity and specificity.
Pages: 366-74
Citations: 32
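The three figures of merit named at the end of the abstract follow directly from a binary confusion matrix. The sketch below shows the standard definitions; the function name and the counts in the usage example are hypothetical and are not results from the paper.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute PPV, sensitivity and specificity from confusion-matrix counts.

    tp/fp: relevant / irrelevant articles the classifier labelled as relevant.
    tn/fn: irrelevant / relevant articles the classifier labelled as irrelevant.
    A ratio with a zero denominator is reported as NaN.
    """
    def ratio(numerator: int, denominator: int) -> float:
        return numerator / denominator if denominator else float("nan")

    return {
        "ppv": ratio(tp, tp + fp),          # precision: TP / (TP + FP)
        "sensitivity": ratio(tp, tp + fn),  # recall: TP / (TP + FN)
        "specificity": ratio(tn, tn + fp),  # TN / (TN + FP)
    }


if __name__ == "__main__":
    # Hypothetical counts for a 1000-article test set.
    metrics = classification_metrics(tp=90, fp=10, tn=880, fn=20)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

Reporting all three matters for literature retrieval because relevant articles are usually a small minority, so specificity alone can look high even when many relevant records are missed.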
Discover true association rates in multi-protein complex proteomics data sets.
Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date : 2005-01-01 DOI: 10.1109/csb.2005.29
Changyu Shen, Lang Li, Jake Yue Chen
Abstract: Experimental processes to collect and process proteomics data are increasingly complex, while the computational methods to assess the quality and significance of these data remain unsophisticated. These challenges have led to many biological oversights and computational misconceptions. We developed a complete empirical Bayes model to analyze multi-protein complex (MPC) proteomics data derived from peptide mass spectrometry detections of purified protein complex pull-down experiments. Our model considers not only bait-prey associations, but also prey-prey associations missed in previous work. Using our model and a yeast MPC proteomics data set, we estimated that there should be an average of 28 true associations per MPC, almost ten times as high as was previously estimated. For data sets generated to mimic a real proteome, our model achieved on average 80% sensitivity in detecting true associations, as compared with the 3% sensitivity in previous work, while maintaining a comparable false discovery rate of 0.3%.
Pages: 167-74
Citations: 2
Motif extraction and protein classification.
Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date : 2005-01-01 DOI: 10.1109/csb.2005.39
Vered Kunik, Zach Solan, Shimon Edelman, Eytan Ruppin, David Horn
Abstract: We present a novel unsupervised method for extracting meaningful motifs from biological sequence data. This de novo motif extraction (MEX) algorithm is data driven, finding motifs that are not necessarily over-represented in the data. Applying MEX to the oxidoreductase class of enzymes, containing approximately 7000 enzyme sequences, yields a relatively small set of motifs. This set spans a motif space that is used for functional classification of the enzymes by an SVM classifier. The classification based on MEX motifs surpasses that of two other SVM-based methods: SVMProt, a method based on the analysis of physical-chemical properties of a protein generated from its sequence of amino acids, and SVM applied to a Smith-Waterman distance matrix. Our findings demonstrate that the MEX algorithm extracts relevant motifs, supporting a successful sequence-to-function classification.
Pages: 80-5
Citations: 33
An efficient and accurate algorithm for assigning nuclear Overhauser effect restraints using a rotamer library ensemble and residual dipolar couplings.
Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date : 2005-01-01 DOI: 10.1109/csb.2005.13
Lincong Wang, Bruce Randall Donald
Abstract: Nuclear Overhauser effect (NOE) distance restraints are the main experimental data from protein nuclear magnetic resonance (NMR) spectroscopy for computing a complete three-dimensional solution structure including sidechain conformations. In general, NOE restraints must be assigned before they can be used in a structure determination program. NOE assignment is very time-consuming to do manually, challenging to fully automate, and has become a key bottleneck for high-throughput NMR structure determination. The difficulty in automated NOE assignment is ambiguity: there can be tens of possible different assignments for an NOE peak based solely on its chemical shifts. Previous automated NOE assignment approaches rely on an ensemble of structures, computed from a subset of all the NOEs, to iteratively filter ambiguous assignments. These algorithms are heuristic in nature, provide no guarantees on solution quality or running time, and are slow in practice. In this paper we present an accurate, efficient NOE assignment algorithm. The algorithm first invokes the algorithm in [30, 29] to compute an accurate backbone structure using only two backbone residual dipolar couplings (RDCs) per residue. The algorithm then filters ambiguous NOE assignments by merging an ensemble of intra-residue vectors from a protein rotamer database with internuclear vectors from the computed backbone structure. The protein rotamer database was built from ultra-high-resolution structures (<1.0 Å) in the Protein Data Bank (PDB). The algorithm has been successfully applied to assign more than 1,700 NOE distance restraints with better than 90% accuracy on the protein human ubiquitin using real, experimentally recorded NMR data. The algorithm assigns these NOE restraints in less than one second on a single-processor workstation.
Pages: 189-202
Citations: 20
Discriminative discovery of transcription factor binding sites from location data.
Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date : 2005-01-01 DOI: 10.1109/csb.2005.30
Yuji Kawada, Yasubumi Sakakibara
Abstract:
Motivation: The availability of genome-wide location analyses based on chromatin immunoprecipitation (ChIP) data gives new insight for in silico analysis of transcriptional regulation.
Results: We propose a novel discriminative discovery framework for precisely identifying transcriptional regulatory motifs from both positive and negative samples (sets of upstream sequences of genes bound and not bound by a transcription factor (TF)) based on genome-wide location data. In this framework, our goal is to find the discriminative motifs that best explain the location data, in the sense that the motifs precisely discriminate the positive samples from the negative ones. First, in order to discover an initial set of substrings that discriminate between positive and negative samples, we apply a decision tree learning method which produces a text-classification tree. We extract several clusters consisting of similar substrings from the internal nodes of the learned tree. Second, we start with initial profile-HMMs constructed from each cluster to represent putative motifs and iteratively refine the profile-HMMs to improve discrimination accuracy. Our genome-wide experimental results on yeast show that our method successfully identifies the consensus sequences for known TFs in the literature and discriminates well between positive and negative samples for all the TFs, while most other motif detection methods perform very poorly on this discrimination problem. Our learned profile-HMMs also improve on false negative predictions in ChIP data.
Pages: 86-9
Citations: 1
Efficient algorithms and software for detection of full-length LTR retrotransposons.
Proceedings. IEEE Computational Systems Bioinformatics Conference Pub Date : 2005-01-01 DOI: 10.1109/csb.2005.31
Anantharaman Kalyanaraman, Srinivas Aluru
Abstract: LTR retrotransposons constitute one of the most abundant classes of repetitive elements in eukaryotic genomes. In this paper, we present a new algorithm for detection of full-length LTR retrotransposons in genomic sequences. The algorithm identifies regions in a genomic sequence that show structural characteristics of LTR retrotransposons. Three key components distinguish our algorithm from current software: (i) a novel method that preprocesses the entire genomic sequence in linear time and produces high-quality pairs of LTR candidates in running time that is constant per pair, (ii) a thorough alignment-based evaluation of candidate pairs to ensure high-quality prediction, and (iii) a robust parameter set encompassing both structural constraints and quality controls, providing users with a high degree of flexibility. Validation of both our serial and parallel implementations of the algorithm against the yeast genome indicates superior quality and performance when compared to existing software.
Pages: 56-64
Citations: 8