Proceedings of the ... Asia-Pacific bioinformatics conference: Latest Articles

Trends in Codon and Amino Acid Usage in Human Pathogen Tropheryma Whipplei, the only Known Actinobacteria with Reduced Genome
Proceedings of the ... Asia-Pacific bioinformatics conference Pub Date : 2005-12-01 DOI: 10.1142/9781860947292_0017
Sabyasachi Das, Sandip Paul, C. Dutta
Abstract: The factors governing codon and amino acid usage in the predicted protein-coding sequences of the Tropheryma whipplei TW08/27 and Twist genomes have been analyzed. Multivariate analysis identifies replicational-transcriptional selection coupled with DNA strand-specific asymmetric mutational bias as a major driving force behind the significant inter-strand variations in synonymous codon usage patterns in T. whipplei genes, while a residual intra-strand synonymous codon bias is imparted by a selection force operating at the level of translation. The strand-specific mutational pressure has little influence on amino acid usage, for which the mean hydropathy level and aromaticity are the major sources of variation, both having nearly equal impact. In spite of the intracellular lifestyle, the amino acid usage in highly expressed gene products of T. whipplei follows the cost-minimization hypothesis. Both genomes under study are characterized by the presence of two distinct groups of membrane-associated genes, whose products exhibit significant differences in primary and potential secondary structures as well as in the propensity for protein disorder.
Citations: 0
Discriminative Detection of Cis-Acting Regulatory Variation From Location Data
Proceedings of the ... Asia-Pacific bioinformatics conference Pub Date : 2005-12-01 DOI: 10.1142/9781860947292_0012
Yu Kawada, Y. Sakakibara
Abstract: The interaction between transcription factors and their DNA binding sites plays a key role in understanding gene regulation mechanisms. Recent studies revealed the presence of "functional polymorphism" in genes, defined as regulatory variation measured in transcription levels due to cis-acting sequence differences. These regulatory variants are assumed to contribute to modulating gene functions. However, computational identification of such functional cis-regulatory variants is a much greater challenge than just identifying consensus sequences, because cis-regulatory variants differ by only a few bases from the main consensus sequences, while they have important consequences for organismal phenotype. None of the previous studies have directly addressed this problem. We propose a novel discriminative detection method for precisely identifying transcription factor binding sites and their functional variants from both positive and negative samples (sets of upstream sequences of genes bound and not bound by a transcription factor) based on genome-wide location data. Our goal is to find discriminative substrings that best explain the location data, in the sense that the substrings precisely discriminate the positive samples from the negative ones, rather than finding substrings that are simply over-represented among the positive ones. Our method consists of two steps. First, we apply a decision tree learning method to discover discriminative substrings and a hierarchical relationship among them. Second, we extract a main motif and then a second motif as a cis-regulatory variant by utilizing functional annotations. Our genome-wide experimental results on the yeast Saccharomyces cerevisiae show that our method performed significantly better at detecting experimentally verified consensus sequences than current motif detection methods. In addition, our method successfully discovered second motifs of putative functional cis-regulatory variants that are associated with genes of different functional annotations, and the correctness of those variants has been verified by expression profile analyses.
Citations: 1
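The idea of scoring a substring by how well it separates bound from unbound upstream sequences, rather than by over-representation alone, can be sketched as follows. This is only an illustration: the abstract does not say which split criterion the decision tree uses, so information gain is an assumption here, and the function names and toy sequences are hypothetical.

```python
import math

def entropy(pos, neg):
    """Entropy (in bits) of a set holding `pos` positive and `neg` negative samples."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h

def information_gain(substring, positives, negatives):
    """Entropy reduction obtained by splitting sequences on presence of `substring`."""
    pos_with = sum(substring in s for s in positives)
    neg_with = sum(substring in s for s in negatives)
    pos_without = len(positives) - pos_with
    neg_without = len(negatives) - neg_with
    total = len(positives) + len(negatives)
    before = entropy(len(positives), len(negatives))
    after = ((pos_with + neg_with) / total) * entropy(pos_with, neg_with)
    after += ((pos_without + neg_without) / total) * entropy(pos_without, neg_without)
    return before - after

# Toy sequences, made up for illustration (not data from the paper).
bound = ["TTACGTGA", "GGACGTGC", "ACGTGTTT"]
unbound = ["TTAAAAGA", "GGCCCTGC", "ATATATAT"]
candidates = ["ACGTG", "TT", "GC"]
best = max(candidates, key=lambda k: information_gain(k, bound, unbound))
```

In this toy set, "ACGTG" occurs in every bound sequence and in no unbound one, so it is selected as the root split; a substring such as "GC", which appears equally often in both sets, gains nothing.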
A Randomized Algorithm for Learning Mahalanobis Metrics: Application to Classification and Regression of Biological Data
Proceedings of the ... Asia-Pacific bioinformatics conference Pub Date : 2005-12-01 DOI: 10.1142/9781860947292_0025
C. Langmead
Abstract: We present a randomized algorithm for semi-supervised learning of Mahalanobis metrics over R^n. The inputs to the algorithm are a set, U, of unlabeled points in R^n; a set of pairs of points, S = {(x, y)_i}, x, y ∈ U, that are known to be similar; and a set of pairs of points, D = {(x, y)_i}, x, y ∈ U, that are known to be dissimilar. The algorithm randomly samples S, D, and m-dimensional subspaces of R^n and learns a metric for each subspace. The metric over R^n is a linear combination of the subspace metrics. The randomization addresses issues of efficiency and overfitting. Extensions of the algorithm to learning non-linear metrics via kernels, and to use as a pre-processing step for dimensionality reduction, are discussed. The new method is demonstrated on a regression problem (structure-based chemical shift prediction) and a classification problem (predicting clinical outcomes for immunomodulatory strategies for treating severe sepsis).
Citations: 5
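As a rough illustration of the final combination step described in the abstract above, the sketch below evaluates a Mahalanobis distance sqrt((x - y)^T A (x - y)) and builds A as a weighted sum of metrics defined on subspaces. It does not show how each subspace metric is learned. The restriction to axis-aligned subspaces, the weights, and all function names are assumptions made for this example.

```python
import math

def mahalanobis(x, y, A):
    """Distance sqrt((x - y)^T A (x - y)) for a positive semi-definite matrix A."""
    d = [xi - yi for xi, yi in zip(x, y)]
    Ad = [sum(a * dj for a, dj in zip(row, d)) for row in A]
    return math.sqrt(sum(di * v for di, v in zip(d, Ad)))

def embed(sub_metric, dims, n):
    """Lift an m x m metric on coordinates `dims` to an n x n matrix (zeros elsewhere)."""
    A = [[0.0] * n for _ in range(n)]
    for i, di in enumerate(dims):
        for j, dj in enumerate(dims):
            A[di][dj] = sub_metric[i][j]
    return A

def combine(metrics, weights):
    """Weighted sum of equally sized square matrices."""
    n = len(metrics[0])
    return [[sum(w * M[i][j] for M, w in zip(metrics, weights)) for j in range(n)]
            for i in range(n)]

# Two hypothetical 2-D subspace metrics inside R^3, combined with equal weights.
A1 = embed([[2.0, 0.0], [0.0, 1.0]], dims=(0, 1), n=3)
A2 = embed([[1.0, 0.0], [0.0, 4.0]], dims=(1, 2), n=3)
A = combine([A1, A2], weights=[0.5, 0.5])
dist = mahalanobis([1.0, 2.0, 3.0], [0.0, 0.0, 0.0], A)
```

With the identity matrix as A this reduces to the ordinary Euclidean distance, which is a convenient sanity check.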
A More Accurate and Efficient Whole Genome Phylogeny
Proceedings of the ... Asia-Pacific bioinformatics conference Pub Date : 2005-12-01 DOI: 10.1142/9781860947292_0037
P. Chan, T. Lam, S. Yiu
Abstract: To reconstruct a phylogeny for a given set of species, most previous approaches rely on similarity information derived from a subset of conserved regions (or genes) in the corresponding genomes. In some cases, the regions chosen may not reflect the evolutionary history of the species and may be too restricted to differentiate the species. It is generally believed that the inference could be more accurate if whole genomes are considered. The best existing solution that makes use of complete genomes was proposed by Henz et al. [13]. They can construct a phylogeny for 91 prokaryotic genomes in 170 CPU hours with an accuracy of about 70% (based on the measurement of non-trivial splits), while other approaches that use whole genomes can deal with no more than 20 species. Note that Henz et al. measure the distance between species using BLASTN, which is not primarily designed for whole genome alignment. Also, their approach is not scalable; for example, it would probably take over 1000 CPU hours to construct a phylogeny for all 230 prokaryotic genomes published by NCBI. In addition, we found that the non-trivial splits measure is only a rough indicator of the accuracy of a phylogeny. In this paper, we propose the following. (1) To evaluate the quality of a phylogeny with respect to a model answer, we suggest using the concept of the maximum agreement subtree, as it can capture the structure of the phylogeny. (2) We propose using whole genome alignment software (such as MUMmer) to measure the distances between species, and we derive an efficient approach to generate these distances. From experiments on real data sets, we found that our approach is more accurate and more scalable than that of Henz et al. We can construct a phylogenetic tree for the same set of 91 genomes with an accuracy more than 20% higher (with respect to both evaluation measures) in 2 CPU hours (more than 80 times faster than their approach). Also, our approach is scalable and can construct a phylogeny for 230 prokaryotic genomes with accuracy as high as 85% in only 9.5 CPU hours.
Citations: 6
ONBRIRES: Ontology-Based Biological Relation Extraction System
Proceedings of the ... Asia-Pacific bioinformatics conference Pub Date : 2005-12-01 DOI: 10.1142/9781860947292_0036
Minlie Huang, Xiaoyan Zhu, Shilin Ding, Hao Yu, Ming Li
Abstract: Automated discovery and extraction of biological relations from online documents, particularly MEDLINE texts, has become essential and urgent because such literature data are accumulating at a tremendous rate. In this paper, we present an ontology-based framework for a biological relation extraction system. This framework is unified and able to extract several kinds of relations, such as gene-disease, gene-gene, and protein-protein interactions. The main contributions of this paper are that we propose a two-level pattern learning algorithm and organize patterns hierarchically.
Citations: 20
Analyzing Inconsistency Toward Enhancing Integration of Biological Molecular Databases
Proceedings of the ... Asia-Pacific bioinformatics conference Pub Date : 2005-12-01 DOI: 10.1142/9781860947292_0023
Y. Chen, Qingfeng Chen
Abstract: The rapid growth of biological databases not only provides biologists with abundant data but also presents a big challenge for data analysis. Many data analysis approaches, such as data mining, information retrieval and machine learning, have been used to extract frequent patterns from diverse biological databases. However, discrepancies due to differences in the structure of databases and their terminologies result in a significant lack of interoperability. Although ontology-based approaches have been used to integrate biological databases, the analysis of inconsistency among biological databases has been largely neglected. This paper presents a method for measuring the degree of inconsistency between biological databases. It not only provides a guideline for correct and efficient database integration, but also exposes high-quality data for data mining and knowledge discovery.
Citations: 5
Accuracy of Four Heuristics for the Full Sibship Reconstruction Problem in the Presence of Genotype Errors
Proceedings of the ... Asia-Pacific bioinformatics conference Pub Date : 2005-12-01 DOI: 10.1142/9781860947292_0004
D. Konovalov
Abstract: The full sibship reconstruction (FSR) problem is the problem of inferring all groups of full siblings from a given population sample using genetic marker data without parental information. The FSR problem remains a significant challenge for computational biology, since an exact solution for the problem has not been found. A new algorithm, named SIMPSON-assisted Descending Ratio (SDR), is devised by combining a new Simpson index based O(n^2) algorithm (MS2) and the existing Descending Ratio (DR) algorithm. The SDR algorithm outperforms the SIMPSON, MS2, and DR algorithms in accuracy and robustness when tested on a variety of sample family structures. The accuracy error is measured as the percentage of incorrectly assigned individuals. The robustness of the FSR algorithms is assessed by simulating a 2% mutation rate per locus (a 1% rate per allele).
Citations: 15
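The correspondence between the two error rates quoted in the abstract above (1% per allele, about 2% per locus) can be checked with a short calculation. The sketch assumes diploid genotypes with two alleles per locus and independent errors per allele; neither assumption is stated in the abstract, and the function name is hypothetical.

```python
def per_locus_rate(per_allele_rate, alleles_per_locus=2):
    """Probability that at least one allele at a locus is altered,
    assuming independent errors per allele (an assumption of this sketch)."""
    return 1.0 - (1.0 - per_allele_rate) ** alleles_per_locus

rate = per_locus_rate(0.01)  # 1 - 0.99**2 = 0.0199, i.e. roughly 2%
```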
Predicting Ranked SCOP Domains by Mining Associations of Visual Contents in Distance Matrices
Proceedings of the ... Asia-Pacific bioinformatics conference Pub Date : 2005-12-01 DOI: 10.1142/9781860947292_0008
Pin-Hao Chi, C. Shyu
Abstract: Protein tertiary structures are known to have significant correlations with their biological functions. To organize information about protein structures, the Structural Classification of Proteins (SCOP) database, which is manually constructed by human experts, classifies similar protein folds in the same domain hierarchy. Even though this approach is believed to be more reliable than applying traditional alignment methods in structural classification, it is labor intensive. In this paper, we build a non-parametric classifier to predict possible SCOP domains for unknown protein structures. With supervised learning, the algorithm first maps tertiary structures of training proteins into two-dimensional distance matrices, and then extracts signatures from the visual contents of the matrices. A knowledge discovery and data mining (KDD) process further discovers relevant patterns in the training signatures of each SCOP domain by mining association rules. Finally, the number of rules whose patterns match the signatures of an unknown protein determines the predicted domains in ranked order. We select 7,702 protein chains from 150 domains of the SCOP database, release 1.67, as labelled data and use 10-fold cross validation. Experimental results show that the prediction accuracy is 91.27% for the top-ranked domain and 99.22% for the top 5 ranked domains. The average response time is 6.34 seconds, exhibiting reasonably high prediction accuracy and efficiency.
Citations: 2
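The first step named in the abstract above, mapping a tertiary structure to a two-dimensional distance matrix, can be sketched as below. The abstract does not say which atoms are used or how the matrix is turned into an image, so representing each residue by a single 3-D point (for example a C-alpha position) is an assumption, and the coordinates are made up.

```python
import math

def distance_matrix(coords):
    """Pairwise Euclidean distances between 3-D points, as a list of rows."""
    return [[math.dist(a, b) for b in coords] for a in coords]

# Hypothetical residue coordinates, one point per residue.
coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 12.0)]
matrix = distance_matrix(coords)
```

The resulting matrix is symmetric with a zero diagonal, and it is unchanged by rotating or translating the structure, which is one reason such matrices are a common starting point for structure comparison.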
Cells In Silico (CIS): A biomedical simulation framework based on Markov random field
Proceedings of the ... Asia-Pacific bioinformatics conference Pub Date : 2005-01-01 DOI: 10.1142/9781860947322_0015
Kung-Hao Liang
Abstract: This paper presents CIS, a biomedical simulation framework based on the Markov random field (MRF). CIS is a discrete-domain 2-D simulation framework that emphasizes the spatial interactions of biomedical entities. The probability model within the MRF framework facilitates the construction of more realistic models than deterministic differential equation approaches and cellular automata. The global phenomena in CIS are dictated by the local conditional probabilities. In addition, multiscale MRF is potentially useful for modelling complex biomedical phenomena at multiple spatial and time scales. The methodology and procedure of CIS for a biomedical simulation are presented using the scenario of tumor-induced hypoxia and angiogenesis as an example. The goal of this research is to unveil the complex appearances of biomedical phenomena using mathematical models, thus enhancing our understanding of the secrets of life.

Computational cell biology is an emerging discipline where biomedical simulations are employed for the study of cells and their microenvironments at various spatio-temporal scales. The E-cell and the Virtual Cell projects focus on the molecular and biochemical level within cells, addressing the dynamics of signal transduction, regulatory and metabolic networks. Sub-cell compartmental models are constructed and integrated gradually so as to simulate a particular facet (or pathway) of cells. The Epitheliome project is an example of tissue-level simulation, aiming to depict epithelial cell growth and the social behavior of cells in culture. Simulations of higher-level systems include Physiome and the modelling of many organs such as the heart. Each scale of simulation sheds light on different aspects of life.

Biomedical simulations have been conducted in both the continuous and discrete domains. Differential equations are the key elements of continuous-domain simulation, where the concentrations of particular receptors, ligands, enzymes or metabolites are modelled at various spatial and temporal scales. This approach is limited by the fact that many biomedical phenomena are too complex to be described by sets of differential equations. In addition, deterministic differential equations are not adequate for describing the many biological phenomena that have a stochastic nature. Alternatively, discrete-domain simulations are processed on a spatio-temporal discrete lattice. The combination of the Potts model and the Metropolis algorithm has been used to simulate cell sorting, morphogenesis, the behavior of malignant tumors and the failure of Tamoxifen treatment in cancer.
Citations: 0
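The discrete-lattice approach mentioned at the end of the abstract above (a Potts model updated with the Metropolis algorithm) can be sketched as follows. This is a generic textbook version, not the CIS framework itself and not the cell-sorting models the abstract alludes to; the energy function, periodic boundaries, parameter values and function names are all assumptions.

```python
import math
import random

def local_energy(grid, r, c, state, coupling=1.0):
    """Potts energy of site (r, c) if it held `state`: -coupling for each of the
    four nearest neighbours (periodic boundaries) in the same state."""
    n_rows, n_cols = len(grid), len(grid[0])
    neighbours = [
        ((r - 1) % n_rows, c),
        ((r + 1) % n_rows, c),
        (r, (c - 1) % n_cols),
        (r, (c + 1) % n_cols),
    ]
    return -coupling * sum(grid[i][j] == state for i, j in neighbours)

def metropolis_sweep(grid, n_states, temperature, rng):
    """One sweep: as many single-site Metropolis updates as there are sites."""
    n_rows, n_cols = len(grid), len(grid[0])
    for _ in range(n_rows * n_cols):
        r = rng.randrange(n_rows)
        c = rng.randrange(n_cols)
        proposal = rng.randrange(n_states)
        delta = local_energy(grid, r, c, proposal) - local_energy(grid, r, c, grid[r][c])
        # Accept moves that lower the energy; accept others with probability exp(-delta / T).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            grid[r][c] = proposal

rng = random.Random(0)  # fixed seed so runs are repeatable
n_states = 3
grid = [[rng.randrange(n_states) for _ in range(8)] for _ in range(8)]
for _ in range(10):
    metropolis_sweep(grid, n_states, temperature=0.5, rng=rng)
```

Each update depends only on a site's four neighbours, which mirrors the abstract's point that global behavior emerges from local conditional probabilities.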
Protein informatics towards integration of data grid and computing grid
Proceedings of the ... Asia-Pacific bioinformatics conference Pub Date : 2005-01-01 DOI: 10.1142/9781860947322_0036
Haruki Nakamura
Abstract: Information on the structures and functions of protein molecules, and on the mutual interactions that make up protein networks, is increasing rapidly as a consequence of the structural genomics and structural proteomics projects [1]. Advanced applications of such information require Grid technology to solve two problems: (i) the shortage of computational power, and (ii) the lack of a capability for seamlessly and quickly retrieving data from the variety of heterogeneous biological databases [2].
Citations: 3