Count-based transcriptome analysis to identify differentially expressed genes for breast cancer

2016 International Conference on Bioinformatics and Systems Biology (BSB), Pub Date: 2016-03-04, DOI: 10.1109/BSB.2016.7552147

R. Tripathi, Pawan Sharma, P. Chakraborty, P. Varadwaj


Citations: 1

Abstract

Sequencing the coding regions or the whole cancer transcriptome can provide valuable information about the differential expression patterns of genes. Earlier research centered on the ~2% of the human genome that is protein-coding, assuming that the non-coding sequences were "junk" lacking significant functional information. Recent medical research shows that a major percentage of the human genome (~70-90%) is non-coding and is present in the cell in the form of non-coding RNA (ncRNA), which overshadows the coding information confined to that small percentage. These ncRNAs are composed mostly of ultraconserved elements, lack protein-coding potential, and regulate gene expression by acting as enhancers; their aberrant expression may be involved in pathological processes such as cancer. Here, we describe RNA-seq data analysis for profiling the transcriptome of breast cells and provide a generic outline of the whole pipeline, from next-generation sequencing (NGS) output to quantification of differential gene expression across conditions (e.g., control vs. test). We used the Cufflinks-Cuffdiff tool to estimate transcript-level expression from high-throughput RNA-seq data across distinct conditions and to identify genes that represent candidate biomarkers for future research. This study provides a survey of the expression of genes associated with coding transcripts within a cancer system.
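To make the idea of quantifying differential expression between a control and a test condition concrete, the following is a minimal sketch in Python. It is not the authors' pipeline and it does not reproduce Cuffdiff, which fits a statistical model to replicate data and tests for significance. It only illustrates two ingredients: length- and depth-normalized expression (FPKM, the unit Cufflinks reports) and a log2 fold change between conditions. The function names, the pseudocount, and the gene names and counts are all hypothetical.

```python
import math


def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def log2_fold_change(control, test, pseudocount=1.0):
    """log2(test / control); the pseudocount (an assumption) avoids division by zero."""
    return math.log2((test + pseudocount) / (control + pseudocount))


def rank_genes(lengths, control_counts, test_counts):
    """Return (gene, control FPKM, test FPKM, log2 FC), largest |log2 FC| first."""
    n_control = sum(control_counts.values())
    n_test = sum(test_counts.values())
    rows = []
    for gene, length in lengths.items():
        c = fpkm(control_counts[gene], length, n_control)
        t = fpkm(test_counts[gene], length, n_test)
        rows.append((gene, c, t, log2_fold_change(c, t)))
    return sorted(rows, key=lambda row: abs(row[3]), reverse=True)


if __name__ == "__main__":
    # Hypothetical toy data: gene lengths in bp and fragment counts per condition.
    lengths = {"GENE_A": 1000, "GENE_B": 1000}
    control = {"GENE_A": 100, "GENE_B": 100}
    test = {"GENE_A": 100, "GENE_B": 300}
    for gene, c, t, lfc in rank_genes(lengths, control, test):
        print(f"{gene}\tcontrol={c:.1f}\ttest={t:.1f}\tlog2FC={lfc:.3f}")
```

A ranking by fold change alone ignores biological variability, so a real analysis would rely on replicates and the significance values reported by a dedicated tool rather than on this ordering.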
