RCPA: An Open-Source R Package for Data Processing, Differential Analysis, Consensus Pathway Analysis, and Visualization

Hung Nguyen, Ha Nguyen, Zeynab Maghsoudi, Bang Tran, Sorin Draghici, Tin Nguyen
Current Protocols. Published 2024-05-07. DOI: 10.1002/cpz1.1036
https://onlinelibrary.wiley.com/doi/10.1002/cpz1.1036

Abstract


Identifying impacted pathways is important because it provides insight into the biology underlying a condition, beyond the detection of differentially expressed genes. Because of this importance, more than 100 pathway analysis methods have been developed thus far. Despite the availability of many methods, it remains challenging for biomedical researchers to learn and properly perform pathway analysis. First, the sheer number of methods makes it hard to learn them and to choose the correct method for a given experiment. Second, computational methods require users to be savvy with coding syntax and comfortable with command-line environments, areas that are unfamiliar to most life scientists. Third, because learning tools and computational methods are typically implemented for only a few species (i.e., human and some model organisms), it is difficult to perform pathway analysis on species that many current tools do not include. Finally, existing pathway tools do not allow researchers to combine, compare, and contrast the results of different methods and experiments for both hypothesis testing and analysis purposes. To address these challenges, we developed an open-source R package for Consensus Pathway Analysis (RCPA) that allows researchers to conveniently: (1) download and process data from NCBI GEO; (2) perform differential analysis using established techniques developed for both microarray and sequencing data; (3) perform both gene set enrichment and topology-based pathway analysis using different methods that seek to answer different research hypotheses; (4) combine methods and datasets to find consensus results; and (5) visualize analysis results and explore significantly impacted pathways across multiple analyses.
This protocol provides many example code snippets with detailed explanations and supports the analysis of more than 1000 species, two pathway databases, three differential analysis techniques, eight pathway analysis tools, six meta-analysis methods, and two consensus analysis techniques. The package is freely available on the CRAN repository. © 2024 The Authors. Current Protocols published by Wiley Periodicals LLC.
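
To make the idea of combining results across datasets concrete, here is a minimal, language-agnostic sketch in Python of Fisher's method, a classical way to merge independent p-values for the same pathway. It is an illustration only: the function name `fisher_combine` is hypothetical, this is not RCPA code, and the abstract does not say whether Fisher's method is among the six meta-analysis methods the package supports.

```python
import math


def fisher_combine(p_values):
    """Combine independent p-values with Fisher's method (illustrative only)."""
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")

    k = len(p_values)
    # Fisher's statistic X = -2 * sum(ln p) is chi-square with 2k degrees
    # of freedom under the null hypothesis; we keep X / 2 for convenience.
    half_stat = -sum(math.log(p) for p in p_values)

    # For even degrees of freedom the chi-square survival function has a
    # closed form: exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


if __name__ == "__main__":
    # Hypothetical p-values for one pathway in three independent datasets.
    print(fisher_combine([0.04, 0.20, 0.01]))
```

The closed-form survival function keeps the sketch dependency-free; in practice a statistics library would normally be used, and correlated datasets would violate the independence assumption behind the method.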

Basic Protocol 1: Processing Affymetrix microarrays

Basic Protocol 2: Processing Agilent microarrays

Support Protocol: Processing RNA sequencing (RNA-Seq) data

Basic Protocol 3: Differential analysis of microarray data (Affymetrix and Agilent)

Basic Protocol 4: Differential analysis of RNA-Seq data

Basic Protocol 5: Gene set enrichment analysis

Basic Protocol 6: Topology-based (TB) pathway analysis

Basic Protocol 7: Data integration and visualization
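
As a rough illustration of what gene set enrichment analysis (Basic Protocol 5) can involve, the sketch below computes an over-representation p-value with the hypergeometric test in Python. This is a generic textbook technique written under stated assumptions: the function name `ora_p_value` and the example counts are hypothetical, and nothing here reflects RCPA's own functions or its choice of methods.

```python
from math import comb


def ora_p_value(n_universe, n_pathway, n_de, n_overlap):
    """P(overlap >= n_overlap) under the hypergeometric null (illustrative only).

    n_universe: number of genes measured
    n_pathway:  genes in the universe that belong to the pathway
    n_de:       differentially expressed (DE) genes in the universe
    n_overlap:  DE genes that belong to the pathway
    """
    if not (0 <= n_pathway <= n_universe and 0 <= n_de <= n_universe):
        raise ValueError("pathway and DE counts must fit inside the universe")
    upper = min(n_pathway, n_de)
    if not (0 <= n_overlap <= upper):
        raise ValueError("overlap is larger than the pathway or the DE list")

    # Count the ways to draw n_de genes with at least n_overlap pathway genes,
    # then divide by all ways to draw n_de genes from the universe.
    favourable = sum(
        comb(n_pathway, x) * comb(n_universe - n_pathway, n_de - x)
        for x in range(n_overlap, upper + 1)
    )
    return favourable / comb(n_universe, n_de)


if __name__ == "__main__":
    # Hypothetical counts: 20000 genes, a 100-gene pathway, 500 DE genes,
    # 10 of which fall in the pathway.
    print(ora_p_value(20000, 100, 500, 10))
```

Exact integer arithmetic via `math.comb` avoids floating-point underflow for small tail probabilities, at the cost of speed for very large gene universes.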
