A Framework for Route Based Pathway Analysis of Gene Expression Data

Pujan Joshi, Honglin Wang, B. Basso, S. Hong, C. Giardina, Dong-Guk Shin
Proceedings of the 2020 4th International Conference on Computational Biology and Bioinformatics. Published 2020-12-27. DOI: 10.1145/3449258.3449262 (https://doi.org/10.1145/3449258.3449262)

Abstract

Pathway analysis is a key step in genomics studies: it reduces data complexity and brings prior biological knowledge to bear on the results. Over-representation analysis (ORA), functional class scoring (FCS), and topology-based (TB) analysis are considered the three generations of pathway analysis techniques. These methods detect only the differential activity of an entire pathway, and so overlook the importance of individual routes and sections within it. This paper presents a novel route-based pathway analysis framework, Route based Pathway Analysis in Cohorts (rPAC), which uses pathway topology in the true sense by identifying and scoring individual routes within pathways. Activity scores and p-values are calculated for all signaling and effector routes in KEGG signaling pathways, using transcriptomics data from each sample in a given cohort. Overall route activity in a cohort is assessed with two summary metrics, “Proportion of Significance” (PS) and “Average Route Score” (ARS). A systematic evaluation on a large number of simulated datasets showed that rPAC significantly outperforms traditional pathway analysis methods. Case studies of three epithelial cancers from The Cancer Genome Atlas (TCGA) repository revealed that some pathway routes (e.g., tight junction, Th17 cell differentiation, and adipocytokine signaling) can notably differentiate cancer types, while other routes related to lipid metabolism and adipocyte metabolism are co-regulated across cancers. While most of the findings are corroborated by the current understanding of cancer biology, rPAC analysis also identified many previously uncharacterized mechanisms, showing its potential to yield new insights into cancer phenotypes.
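The route-based idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything here is an assumption made for illustration: a route is taken to be a path from a node with no incoming edges to a node with no outgoing edges, PS is taken to be the fraction of samples whose route p-value falls below a threshold, and ARS is taken to be the mean route score over all samples. The function names, the toy graph, and the `alpha` parameter are hypothetical and are not rPAC's actual interface or scoring method.

```python
from statistics import mean


def enumerate_routes(edges):
    """List every path from a source node (no incoming edge) to a sink
    node (no outgoing edge) in a directed pathway graph.

    Assumption: the graph is small; nodes already on the current path
    are not revisited, so cycles cannot cause infinite recursion.
    """
    children, nodes, has_parent = {}, set(), set()
    for src, dst in edges:
        children.setdefault(src, []).append(dst)
        nodes.update((src, dst))
        has_parent.add(dst)

    routes = []

    def walk(node, path):
        if node not in children:  # sink reached: the path is a complete route
            routes.append(path)
            return
        for child in children[node]:
            if child not in path:  # skip nodes already on this path
                walk(child, path + [child])

    for source in sorted(nodes - has_parent):
        walk(source, [source])
    return routes


def summarize_route(scores, p_values, alpha=0.05):
    """Return (PS, ARS) for one route across a cohort.

    Assumed definitions, not taken from the paper:
      PS  = share of samples with p-value below alpha
      ARS = mean activity score over all samples
    """
    ps = sum(p < alpha for p in p_values) / len(p_values)
    ars = mean(scores)
    return ps, ars


# Toy pathway with placeholder node names (not real genes).
toy_edges = [("A", "B"), ("B", "C"), ("B", "D"), ("D", "E")]
toy_routes = enumerate_routes(toy_edges)

# Made-up per-sample scores and p-values for a single route.
toy_ps, toy_ars = summarize_route(
    scores=[1.0, 0.5, -0.5, 2.0],
    p_values=[0.01, 0.20, 0.04, 0.03],
)
```

Scoring each route separately, rather than the pathway as a whole, is what allows two branches of the same pathway to show different activity in the same cohort.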