{"title":"A novel meta-analysis approach of cancer transcriptomes reveals prevailing transcriptional networks in cancer cells.","authors":"A. Niida, S. Imoto, Masao Nagasaki, R. Yamaguchi, S. Miyano","doi":"10.1142/9781848165786_0010","DOIUrl":"https://doi.org/10.1142/9781848165786_0010","url":null,"abstract":"Although microarray technology has revealed transcriptomic diversities underlying various cancer phenotypes, the transcriptional programs controlling them have not been well elucidated. To decode transcriptional programs governing cancer transcriptomes, we have recently developed a computational method termed EEM, which searches for expression modules from prescribed gene sets defined by prior biological knowledge such as TF binding motifs. In this paper, we extend our EEM approach to predict cancer transcriptional networks. Starting from functional TF binding motifs and expression modules identified by EEM, we predict cancer transcriptional networks containing regulatory TFs, associated GO terms, and interactions between TF binding motifs. To systematically analyze transcriptional programs in broad types of cancer, we applied our EEM-based network prediction method to 122 microarray datasets collected from public databases. The datasets contain about 15000 experiments for tumor samples of various tissue origins, including breast, colon, and lung. This EEM-based meta-analysis revealed a prevailing cancer transcriptional network that functions in a large fraction of cancer transcriptomes; it includes cell-cycle and immune-related sub-networks. This study demonstrates the broad applicability of EEM and opens a way to a comprehensive understanding of transcriptional networks in cancer cells.","PeriodicalId":73143,"journal":{"name":"Genome informatics. International Conference on Genome Informatics","volume":"637 1","pages":"121-31"},"PeriodicalIF":0.0,"publicationDate":"2010-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76816021","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Different groups of metabolic genes cluster around early and late firing origins of replication in budding yeast.","authors":"T. Spiesser, E. Klipp","doi":"10.1142/9781848166585_0015","DOIUrl":"https://doi.org/10.1142/9781848166585_0015","url":null,"abstract":"DNA replication is a fundamental process that is tightly regulated during the cell cycle. In budding yeast it starts from multiple origins of replication and proceeds in a timely fashion according to a reproducible temporal program until the entire DNA is replicated exactly once per cell cycle. In this program an origin seems to have an inherent firing probability at a specific time in S-phase that is conserved over the population. However, what exactly determines the origin initiation time remains obscure. In this work, we analyze the gene content that clusters around replication origins following the assumption that inherent origin properties that determine staggered initiation times could potentially be mirrored in the close origin proximity. We perform a Gene Ontology term enrichment test and find that metabolic genes are significantly over-represented in the regions that are close to the starting points of DNA replication. Furthermore, functional analysis also reveals that catabolic genes cluster around early firing origins, whereas anabolic genes can rather be found in the proximity of late firing origins of replication. We speculate that, in budding yeast, gene function around replication origins correlates with their intrinsic probability to initiate DNA replication at a given point in S-phase.","PeriodicalId":73143,"journal":{"name":"Genome informatics. International Conference on Genome Informatics","volume":"56 1","pages":"179-92"},"PeriodicalIF":0.0,"publicationDate":"2010-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90923687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis and prediction of nutritional requirements using structural properties of metabolic networks and support vector machines.","authors":"Takeyuki Tamura, Nils Christian, Kazuhiro Takemoto, O. Ebenhöh, T. Akutsu","doi":"10.1142/9781848165786_0015","DOIUrl":"https://doi.org/10.1142/9781848165786_0015","url":null,"abstract":"Properties of graph representations of genome-scale metabolic networks have been extensively studied. However, the relationship between these structural properties and the functional properties of the networks is still unclear. In this paper, we focus on the nutritional requirements of organisms as a functional property and study their relationship with structural properties of a graph representation of metabolic networks. To examine this relationship, we study to what extent the nutritional requirements can be predicted by support vector machines from structural properties, which include degree exponent, edge density, clustering coefficient, degree centrality, closeness centrality, betweenness centrality and eigenvector centrality. Furthermore, we study which properties most influence the nutritional requirements.","PeriodicalId":73143,"journal":{"name":"Genome informatics. International Conference on Genome Informatics","volume":"72 1","pages":"176-90"},"PeriodicalIF":0.0,"publicationDate":"2010-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91040267","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis of a lipid biosynthesis protein family and phospholipid structural variations.","authors":"Michihiro Tanaka, Yuki Moriya, S. Goto, M. Kanehisa","doi":"10.1142/9781848165786_0016","DOIUrl":"https://doi.org/10.1142/9781848165786_0016","url":null,"abstract":"Glycerophospholipids are major structural lipids in cellular membrane systems and play key roles as suppliers of the first and second messengers in the signal transduction and molecular recognition processes. The distribution of lipid components differs among organelles and cells. The distribution is controlled by two pathways in lipid metabolism: the de novo and remodeling pathways. Glycerophospholipids including arachidonic and stearic acids are mostly produced in the remodeling pathway, whereas lipid chains are reconstructed from those synthesized in the de novo pathway. Recently, lysophospholipid acyltransferases have been isolated as key enzymes in the remodeling pathway, and their substrate specificity has been investigated in terms of the chemical substructures of glycerophospholipids, such as the type of head groups and the length of aliphatic chains. These experimental studies have been reported for specific organisms, and only two representative sequence motifs are known for acyltransferases: a general pattern and the pattern for membrane-bound O-acyltransferase (MBOAT). Here we attempt to correlate the sequence patterns and the substrate specificity of lysophospholipid acyltransferases in 89 eukaryotic genomes in order to understand the roles of this enzyme family and underlying glycerophospholipid structural variations. Using phylogenetic and domain analyses, the lysophospholipid acyltransferase family was divided into 18 subtypes. Furthermore, we examined the occurrence of identified subtypes in eukaryotic genomes, and found the expansion of these subtypes in vertebrates. These findings may provide clues to understanding structural variations and distributions of glycerophospholipids in different organisms.","PeriodicalId":73143,"journal":{"name":"Genome informatics. International Conference on Genome Informatics","volume":"70 1","pages":"191-201"},"PeriodicalIF":0.0,"publicationDate":"2010-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86195115","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Characterizing common substructures of ligands for GPCR protein subfamilies.","authors":"Bekir Erguner, M. Hattori, S. Goto, M. Kanehisa","doi":"10.1142/9781848166585_0003","DOIUrl":"https://doi.org/10.1142/9781848166585_0003","url":null,"abstract":"The G-protein coupled receptor (GPCR) superfamily is the largest class of proteins with therapeutic value. More than 40% of present prescription drugs are GPCR ligands. The high therapeutic value of GPCR proteins and recent advancements in virtual screening methods gave rise to many virtual screening studies for GPCR ligands. However, in spite of vast amounts of research studying their functions and characteristics, 3D structures of most GPCRs are still unknown. This makes target-based virtual screenings of GPCR ligands extremely difficult, and successful virtual screening techniques rely heavily on ligand information. These virtual screening methods focus on specific features of ligands at the GPCR protein level, and common features of ligands at higher levels of GPCR classification are yet to be studied. Here we extracted common substructures of ligands for GPCR protein subfamilies. We used SIMCOMP, a graph-based chemical structure comparison program, and hierarchical clustering to reveal common substructures. We applied our method to 850 GPCR ligands and found 53 common substructures covering 439 ligands. These substructures contribute to a deeper understanding of the structural features of GPCR ligands, which can be used in new drug discovery methods.","PeriodicalId":73143,"journal":{"name":"Genome informatics. International Conference on Genome Informatics","volume":"10 3","pages":"31-41"},"PeriodicalIF":0.0,"publicationDate":"2010-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1142/9781848166585_0003","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72475192","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Integer programming-based method for completing signaling pathways and its application to analysis of colorectal cancer.","authors":"Takeyuki Tamura, Yoshihiro Yamanishi, Mao Tanabe, Susumu Goto, Minoru Kanehisa, Katsuhisa Horimoto, Tatsuya Akutsu","doi":"","DOIUrl":"","url":null,"abstract":"Signaling pathways are often represented by networks where each node corresponds to a protein and each edge corresponds to a relationship between nodes such as activation, inhibition and binding. However, such signaling pathways in a cell may be affected by genetic and epigenetic alteration. Some edges may be deleted and some edges may be newly added. The current knowledge about known signaling pathways is available on some public databases, but most of the signaling pathways including changes upon the cell state alterations remain largely unknown. In this paper, we develop an integer programming-based method for inferring such changes by using gene expression data. We test our method on its ability to reconstruct the pathway of colorectal cancer in the KEGG database.","PeriodicalId":73143,"journal":{"name":"Genome informatics. International Conference on Genome Informatics","volume":"24 ","pages":"193-203"},"PeriodicalIF":0.0,"publicationDate":"2010-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"30251243","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gene regulatory network clustering for graph layout based on microarray gene expression data.","authors":"Kaname Kojima, Seiya Imoto, Masao Nagasaki, Satoru Miyano","doi":"","DOIUrl":"","url":null,"abstract":"We propose a statistical model that simultaneously estimates a gene regulatory network and identifies gene modules from time series gene expression data from microarray experiments. Under the assumption that genes in the same module are densely connected, the proposed method detects gene modules based on the variational Bayesian technique. The model can also incorporate existing biological prior knowledge such as protein subcellular localization. We applied the proposed model to time series data from a synthetically generated network and verified its effectiveness. The proposed model was also applied to time series microarray data from HeLa cells. The detected gene module information greatly helps in drawing the estimated gene network.","PeriodicalId":73143,"journal":{"name":"Genome informatics. International Conference on Genome Informatics","volume":"24 ","pages":"84-95"},"PeriodicalIF":0.0,"publicationDate":"2010-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"30252338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A systems biology approach: modelling of Aquaporin-2 trafficking.","authors":"M. Fröhlich, P. Deen, E. Klipp","doi":"10.1142/9781848166585_0004","DOIUrl":"https://doi.org/10.1142/9781848166585_0004","url":null,"abstract":"In healthy individuals, dehydration of the body leads to release of the hormone vasopressin from the pituitary. Via the bloodstream, vasopressin reaches the collecting duct cells in the kidney, where the water channel Aquaporin-2 (AQP2) is expressed. After stimulation of the vasopressin V2 receptor by vasopressin, intracellular AQP2-containing vesicles fuse with the apical plasma membrane of the collecting duct cells. This leads to increased water reabsorption from the pro-urine into the blood and therefore to enhanced retention of water within the body. Using existing biological data, we propose a mathematical model of AQP2 trafficking and regulation in collecting duct cells. Our model includes the vasopressin receptor, adenylate cyclase, protein kinase A, and intracellular as well as membrane-located AQP2. To model the chemical reactions we used ordinary differential equations (ODEs) based on mass action kinetics. We employ known protein concentrations and time series data to estimate the kinetic parameters of our model and demonstrate its validity. Through generating, testing and ranking different versions of the model, we show that some model versions can describe the data well as soon as important regulatory parts such as the reduction of the signal by internalization of the vasopressin receptor or the negative feedback loop representing phosphodiesterase activity are included. We perform time-dependent sensitivity analysis to identify the reactions that have the greatest influence on the cAMP and membrane-located AQP2 levels over time. We predict the time courses for membrane-located AQP2 at different vasopressin concentrations, compare them with newly generated data and discuss the competencies of the model.","PeriodicalId":73143,"journal":{"name":"Genome informatics. International Conference on Genome Informatics","volume":"1 1","pages":"42-55"},"PeriodicalIF":0.0,"publicationDate":"2010-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79455469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analyzing gene coexpression data by an evolutionary model.","authors":"M. Schütte, M. Mutwil, S. Persson, O. Ebenhöh","doi":"10.1142/9781848166585_0013","DOIUrl":"https://doi.org/10.1142/9781848166585_0013","url":null,"abstract":"Coexpressed genes are tentatively translated into proteins that are involved in similar biological functions. Here, we constructed gene coexpression networks from collected microarray data of the organisms Arabidopsis thaliana, Saccharomyces cerevisiae, and Escherichia coli. Their degree distributions show the common property of an overrepresentation of highly connected nodes followed by a sudden truncation. To analyze this behavior, we present an evolutionary model simulating genetic evolution. This model assumes that new genes emerge by duplication from a small initial set of primordial genes. Our model does not include the removal of unused genes, but selective pressure is indirectly taken into account by preferentially duplicating the old genes. Thus, gene duplication represents the emergence of a new gene and its successful establishment. After a duplication event, all genes are slightly but iteratively mutated, thus altering their expression patterns. Our model is capable of reproducing global properties of the investigated coexpression networks. We show that our model reflects the mean inter-node distances and especially the characteristic humps in the degree distribution that, in the biological examples, result from functionally related genes.","PeriodicalId":73143,"journal":{"name":"Genome informatics. International Conference on Genome Informatics","volume":"33 1","pages":"154-63"},"PeriodicalIF":0.0,"publicationDate":"2010-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83831842","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Active pathway identification and classification with probabilistic ensembles.","authors":"Timothy Hancock, Hiroshi Mamitsuka","doi":"","DOIUrl":"","url":null,"abstract":"A popular means of modeling metabolic networks is through identifying frequently observed pathways. However, the definition of what constitutes an observation of a pathway and how to evaluate the importance of identified pathways remain unclear. In this paper we investigate different methods for defining an observed pathway and evaluate their performance with pathway classification models. We use three methods for defining an observed pathway: a path in gene over-expression, a path in probable gene over-expression and a path of most accurate classification. The performance of each definition is evaluated with three classification models: a probabilistic pathway classifier (HME3M), logistic regression and SVM. The results show that defining pathways using the probability of gene over-expression creates stable and accurate classifiers. Conversely, we also show that defining pathways by most accurate classification finds severely biased pathways that are unrepresentative of the underlying microarray data structure.","PeriodicalId":73143,"journal":{"name":"Genome informatics. International Conference on Genome Informatics","volume":"22 ","pages":"30-40"},"PeriodicalIF":0.0,"publicationDate":"2010-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"28783007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}