Gene count estimation with pytximport enables reproducible analysis of bulk RNA sequencing data in Python.

Malte Kuehl, Milagros N Wong, Nicola Wanner, Stefan Bonn, Victor G Puelles
Bioinformatics (Oxford, England). Journal Article. Published 2024-11-20. DOI: 10.1093/bioinformatics/btae700 (https://doi.org/10.1093/bioinformatics/btae700)

Abstract

Summary: Transcript quantification tools efficiently map bulk RNA sequencing reads to reference transcriptomes. However, their output consists of transcript count estimates that are subject to multiple biases and cannot be readily used with existing differential gene expression analysis tools in Python. Here we present pytximport, a Python implementation of the tximport R package that supports a variety of input formats, different modes of bias correction, inferential replicates, gene-level summarization of transcript counts, transcript-level exports, transcript-to-gene mapping generation and optional filtering of transcripts by biotype. pytximport is part of the scverse ecosystem of open-source Python software packages for omics analyses and includes both a Python and a command-line interface. With pytximport, we propose a bulk RNA sequencing analysis workflow based on Bioconda and scverse ecosystem packages, ensuring reproducible analyses through Snakemake rules. We apply this pipeline to a publicly available RNA-sequencing dataset, demonstrating how pytximport enables the creation of Python-centric workflows capable of providing insights into transcriptomic alterations.

Availability: pytximport is licensed under the GNU General Public License version 3. The source code is available at https://github.com/complextissue/pytximport and via Zenodo with DOI: 10.5281/zenodo.13907917. A related Snakemake workflow is available through GitHub at https://github.com/complextissue/snakemake-bulk-rna-seq-workflow and Zenodo with DOI: 10.5281/zenodo.12713811. Documentation and a vignette for new users are available at: https://pytximport.readthedocs.io.

Supplementary information: Supplementary Material is available at Bioinformatics online.
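
Illustrative example: the gene-level summarization of transcript counts mentioned in the Summary can be sketched in a few lines of plain Python. This is not pytximport's API or implementation. The function name `summarize_to_gene`, the input layout, and the toy values are assumptions made for illustration only. The sketch sums counts and abundances (TPM) per gene and computes an abundance-weighted average effective length, which is the general idea behind tximport-style aggregation. The fallback to an unweighted mean length for genes with zero abundance is a simplification.

```python
from collections import defaultdict


def summarize_to_gene(transcripts, transcript_to_gene):
    """Aggregate transcript-level estimates to gene level for one sample.

    transcripts: dict of transcript ID -> (count, tpm, effective_length)
    transcript_to_gene: dict of transcript ID -> gene ID
    Returns a dict of gene ID -> {"count", "tpm", "length"}.
    """
    totals = defaultdict(
        lambda: {"count": 0.0, "tpm": 0.0, "weighted_len": 0.0, "len_sum": 0.0, "n": 0}
    )

    for tx_id, (count, tpm, eff_len) in transcripts.items():
        gene_id = transcript_to_gene.get(tx_id)
        if gene_id is None:
            continue  # transcripts absent from the mapping are skipped
        acc = totals[gene_id]
        acc["count"] += count
        acc["tpm"] += tpm
        acc["weighted_len"] += tpm * eff_len  # weight each length by abundance
        acc["len_sum"] += eff_len
        acc["n"] += 1

    genes = {}
    for gene_id, acc in totals.items():
        if acc["tpm"] > 0:
            length = acc["weighted_len"] / acc["tpm"]
        else:
            # Simplifying assumption: unweighted mean when nothing is expressed.
            length = acc["len_sum"] / acc["n"]
        genes[gene_id] = {"count": acc["count"], "tpm": acc["tpm"], "length": length}
    return genes


# Hypothetical toy data: (estimated count, TPM, effective length) per transcript.
transcripts = {
    "TX1": (100.0, 10.0, 1000.0),
    "TX2": (50.0, 30.0, 2000.0),
    "TX3": (20.0, 5.0, 500.0),
    "TX4": (7.0, 1.0, 800.0),  # not in the mapping, so it is dropped
}
transcript_to_gene = {"TX1": "GENE_A", "TX2": "GENE_A", "TX3": "GENE_B"}

genes = summarize_to_gene(transcripts, transcript_to_gene)
for gene_id, values in sorted(genes.items()):
    print(gene_id, values)
```

For real data, the package's documentation at https://pytximport.readthedocs.io describes the supported input formats and bias-correction modes; the sketch above handles a single sample and applies no bias correction.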
