João Paulo Cassucci Dos Santos, Odemir Martinez Bruno
Application of coincidence index in the discovery of co-expressed metabolic pathways.
Physical Biology. Published 2024-08-29. DOI: 10.1088/1478-3975/ad68b6 (https://doi.org/10.1088/1478-3975/ad68b6)
Citation count: 0
Abstract
Analyzing transcription data requires intensive statistical analysis to obtain useful biological information and knowledge. A significant portion of this data is affected by random noise or even noise intrinsic to the modeling of the experiment. Without robust treatment, the data might not be explored thoroughly, and incorrect conclusions could be drawn. Examining the correlation between gene expression profiles is one way bioinformaticians extract information from transcriptomic experiments. However, the correlation measurements traditionally used have worrisome shortcomings that need to be addressed. This paper compares five already published and experimented-with correlation measurements to the newly developed coincidence index, a similarity measurement that combines Jaccard and interiority indexes and generalizes them to be applied to vectors containing real values. We used microarray and RNA-Seq data from the archaeon Halobacterium salinarum and the bacterium Escherichia coli, respectively, to evaluate the capacity of each correlation/similarity measurement. The utilized method explores the co-expressed metabolic pathways by measuring the correlations between the expression levels of enzymes that share metabolites, represented in the form of a weighted graph. It then searches for local maxima in this graph using a simulated annealing algorithm. We demonstrate that the coincidence index extracts larger, more comprehensive, and more statistically significant pathways for microarray experiments. In RNA-Seq experiments, the results are more limited, but the coincidence index managed the largest percentage of significant components in the graph.
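The abstract names the coincidence index but does not give its formula. As a rough illustration only, the sketch below assumes the form usually given for it in the literature: the product of a real-valued Jaccard index (sum of element-wise minima over sum of element-wise maxima) and an interiority, or overlap, index (sum of element-wise minima over the smaller of the two vector sums). It is restricted to non-negative vectors, which may not match the preprocessing used in the paper. The function name `coincidence_index`, the zero-vector convention, and the example profiles are all assumptions made for this sketch, not details taken from the paper.

```python
import numpy as np


def coincidence_index(x, y):
    """Coincidence index of two non-negative vectors (assumed form, see text).

    Computed as jaccard * interiority, where
      jaccard     = sum(min(x_i, y_i)) / sum(max(x_i, y_i))
      interiority = sum(min(x_i, y_i)) / min(sum(x), sum(y))
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.shape != y.shape:
        raise ValueError("x and y must have the same shape")
    if np.any(x < 0) or np.any(y < 0):
        # Signed data needs a sign-aware generalization not covered here.
        raise ValueError("this sketch only handles non-negative values")

    shared = np.minimum(x, y).sum()   # multiset-style intersection
    union = np.maximum(x, y).sum()    # multiset-style union
    smaller = min(x.sum(), y.sum())   # total of the smaller vector

    if union == 0 or smaller == 0:
        # Convention chosen for this sketch: an all-zero profile has
        # no similarity to anything.
        return 0.0

    jaccard = shared / union
    interiority = shared / smaller
    return float(jaccard * interiority)


if __name__ == "__main__":
    # Hypothetical expression profiles of two enzymes across four conditions.
    enzyme_a = [2.0, 4.0, 1.0, 3.0]
    enzyme_b = [1.0, 2.0, 0.5, 1.5]
    print(coincidence_index(enzyme_a, enzyme_b))
```

In a pipeline like the one described above, a value of this kind could serve as the edge weight between two enzymes that share a metabolite. How the paper actually builds and searches the graph is not specified in the abstract.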
About the journal:
Physical Biology publishes articles in the broad interdisciplinary field bridging biology with the physical sciences and engineering. This journal focuses on research in which quantitative approaches – experimental, theoretical and modeling – lead to new insights into biological systems at all scales of space and time, and all levels of organizational complexity.
Physical Biology accepts contributions from a wide range of biological sub-fields, including topics such as:
molecular biophysics, including single molecule studies, protein-protein and protein-DNA interactions
subcellular structures, organelle dynamics, membranes, protein assemblies, chromosome structure
intracellular processes, e.g. cytoskeleton dynamics, cellular transport, cell division
systems biology, e.g. signaling, gene regulation and metabolic networks
cells and their microenvironment, e.g. cell mechanics and motility, chemotaxis, extracellular matrix, biofilms
cell-material interactions, e.g. biointerfaces, electrical stimulation and sensing, endocytosis
cell-cell interactions, cell aggregates, organoids, tissues and organs
developmental dynamics, including pattern formation and morphogenesis
physical and evolutionary aspects of disease, e.g. cancer progression, amyloid formation
neuronal systems, including information processing by networks, memory and learning
population dynamics, ecology, and evolution
collective action and emergence of collective phenomena.