UGM: a more stable procedure for large-scale multiple testing problems, new solutions to identify oncogene.

Q1 Mathematics
Chengyou Liu, Leilei Zhou, Yuhe Wang, Shuchang Tian, Junlin Zhu, Hang Qin, Yong Ding, Hongbing Jiang
Theoretical Biology and Medical Modelling. Journal Article. Published 2019-12-23. DOI: 10.1186/s12976-019-0117-1 (https://doi.org/10.1186/s12976-019-0117-1)

Abstract

Variations in gene expression levels play an important role in tumors. Numerous methods exist for identifying differentially expressed genes in high-throughput sequencing data, and several algorithms attempt to identify distinctive genetic patterns associated with susceptibility to particular diseases. Although these approaches have proved successful, the number of non-differentially expressed genes estimated under false discovery rate (FDR) control has a large standard deviation, and the misidentification rate (type I error) grows rapidly as the number of genes tested increases. In this study we developed a new method, Unit Gamma Measurement (UGM), which accounts for the distribution of the test statistics across multiple hypotheses and could reduce the dependency problem. Simulated expression profile data and breast cancer RNA-Seq data were used to assess the accuracy of UGM. The results show that the number of non-differentially expressed genes identified by UGM is very close to that in the real data, and that UGM also has a smaller standard error, range, interquartile range and root-mean-square (RMS) error. In addition, UGM can be used to screen many breast cancer-associated genes, such as BRCA1, BRCA2, PTEN and BRIP1, and it provides better accuracy, robustness and efficiency for identifying differentially expressed genes in high-throughput sequencing.
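To make the multiple-testing problem in the abstract concrete, the following is a minimal Python sketch of two standard tools, not the UGM method itself, whose details the abstract does not give. It shows the Benjamini-Hochberg step-up procedure for FDR control and a Storey-type estimate of the number of true null hypotheses (here, non-differentially expressed genes). The function names, the tuning value lam=0.5, and the simulated p-values are all illustrative assumptions.

```python
import random


def benjamini_hochberg(pvalues, alpha=0.05):
    """Return sorted indices of hypotheses rejected by the BH step-up procedure."""
    m = len(pvalues)
    # Order hypothesis indices by ascending p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest 1-based rank k with p_(k) <= k * alpha / m.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / m:
            cutoff = rank
    # Reject every hypothesis ranked at or below that cutoff.
    return sorted(order[:cutoff])


def estimate_null_count(pvalues, lam=0.5):
    """Storey-type estimate of the number of true nulls: #{p > lam} / (1 - lam)."""
    above = sum(1 for p in pvalues if p > lam)
    # Cap the estimate at the total number of hypotheses.
    return min(len(pvalues), above / (1.0 - lam))


def simulate_pvalues(m_null, m_alt, seed=0):
    """Hypothetical toy data: uniform p-values for nulls, tiny ones for alternatives."""
    rng = random.Random(seed)
    nulls = [rng.random() for _ in range(m_null)]
    alts = [rng.random() * 1e-4 for _ in range(m_alt)]
    return nulls + alts


if __name__ == "__main__":
    m_null, m_alt = 9000, 1000  # assumed split, for illustration only
    pvals = simulate_pvalues(m_null, m_alt)
    # Without correction, roughly alpha * m_null nulls fall below the threshold.
    naive_false = sum(1 for p in pvals[:m_null] if p <= 0.05)
    rejected = benjamini_hochberg(pvals, alpha=0.05)
    bh_false = sum(1 for i in rejected if i < m_null)
    print("uncorrected false positives:", naive_false)
    print("BH rejections:", len(rejected), "of which false:", bh_false)
    print("estimated non-DE genes:", round(estimate_null_count(pvals)))
```

Running the script on larger values of m_null illustrates why uncorrected testing misidentifies more genes as the number of tests grows, which is the behavior the abstract describes as motivation for a more stable procedure.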

Source journal
Theoretical Biology and Medical Modelling (Mathematical & Computational Biology)
Self-citation rate: 0.00%
Articles published: 0
Review time: 6-12 weeks
About the journal: Theoretical Biology and Medical Modelling is an open access peer-reviewed journal adopting a broad definition of "biology" and focusing on theoretical ideas and models associated with developments in biology and medicine. Mathematicians, biologists and clinicians of various specialisms, philosophers and historians of science are all contributing to the emergence of novel concepts in an age of systems biology, bioinformatics and computer modelling. This is the field in which Theoretical Biology and Medical Modelling operates. We welcome submissions that are technically sound and that offer either improved understanding in biology and medicine or progress in theory or method.