A Framework for Route Based Pathway Analysis of Gene Expression Data
Pujan Joshi, Honglin Wang, B. Basso, S. Hong, C. Giardina, Dong-Guk Shin
Proceedings of the 2020 4th International Conference on Computational Biology and Bioinformatics. Published 2020-12-27.
DOI: https://doi.org/10.1145/3449258.3449262
Abstract
Pathway analysis is a key step in genomics studies: it reduces data complexity and relates results to prior biological knowledge. Over-representation analysis (ORA), functional class scoring (FCS), and topology-based (TB) analysis are considered the three generations of pathway analysis techniques. These methods detect only the differential activity of an entire pathway and thereby ignore the importance of routes and sections within the pathway. This paper presents a novel route-based framework, Route based Pathway Analysis in Cohorts (rPAC), which makes fuller use of pathway topology by identifying and scoring individual routes within pathways. For each sample in a given cohort, activity scores and p-values are calculated from transcriptomics data for all signaling and effector routes in KEGG signaling pathways. Overall route activity in a cohort is assessed with two summary metrics, “Proportion of Significance” (PS) and “Average Route Score” (ARS). A systematic evaluation on a large number of simulated datasets showed that rPAC significantly outperformed traditional pathway analysis methods. Case studies of three epithelial cancers from The Cancer Genome Atlas (TCGA) repository revealed that some pathway routes (e.g., tight junction, Th17 cell differentiation, and adipocytokine signaling) notably differentiate cancer types, while other routes related to lipid metabolism and adipocyte metabolism are co-regulated across different cancers. Although most of the findings are corroborated by the current understanding of cancer biology, rPAC analysis also identified many previously uncharacterized mechanisms, showing its potential to yield new insights into cancer phenotypes.
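
The abstract names the two cohort-level metrics but does not define them. As a rough illustration only, the sketch below assumes that PS is the fraction of samples in which a route's p-value falls below a significance threshold and that ARS is the mean of the route's per-sample activity scores. These definitions, the function name `summarize_route`, the threshold `alpha`, and the toy numbers are all assumptions made for illustration; they are not taken from the paper and may differ from rPAC's actual formulas.

```python
from statistics import mean


def summarize_route(scores, p_values, alpha=0.05):
    """Summarize one route across a cohort (hypothetical definitions).

    scores   -- per-sample activity scores for the route
    p_values -- per-sample p-values for the route, same order as scores
    alpha    -- significance threshold (an assumed default, not from the paper)

    Returns (ps, ars), where
      ps  = fraction of samples with p-value < alpha  (assumed meaning of PS)
      ars = mean of the per-sample scores             (assumed meaning of ARS)
    """
    if len(scores) != len(p_values):
        raise ValueError("scores and p_values must have the same length")
    if not scores:
        raise ValueError("at least one sample is required")

    n_significant = sum(1 for p in p_values if p < alpha)
    ps = n_significant / len(p_values)
    ars = mean(scores)
    return ps, ars


# Toy cohort of four samples for a single route (made-up numbers).
scores = [1.0, -0.5, 2.0, 0.5]
p_values = [0.01, 0.20, 0.03, 0.60]
ps, ars = summarize_route(scores, p_values)
print(f"PS = {ps}, ARS = {ars}")
```

Under these assumed definitions, a route with a high PS and a large-magnitude ARS would be one that is consistently and strongly altered across the cohort, which is the kind of signal the case studies compare between cancer types.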