At what scale should microarray data be analyzed?

Shuguang Huang, Adeline A Yeo, Lawrence Gelbert, Xi Lin, Laura Nisenbaum, Kerry G Bemis
Journal: American journal of pharmacogenomics : genomics-related research in drug development and clinical practice
Publication type: Journal Article
Publication date: 2004-01-01
DOI: 10.2165/00129785-200404020-00007 (https://doi.org/10.2165/00129785-200404020-00007)
Citations: 11

Abstract

Introduction: The hybridization intensities derived from microarray experiments, for example Affymetrix's MAS5 signals, are usually transformed in some way before statistical models are fitted. The motivation for the transformation is usually to satisfy model assumptions such as normality and homogeneity of variance. Generally speaking, two types of strategies are applied to microarray data, depending on the analysis need: correlation analysis, where all the gene intensities on the array are considered simultaneously, and gene-by-gene ANOVA, where each gene is analyzed individually.

Aim: We investigate the distributional properties of the Affymetrix GeneChip signal data under the two scenarios, focusing on the impact of analyzing the data at an inappropriate scale.

Methods: The Box-Cox family of transformations is first investigated for the strategy of pooling genes, with the commonly used log transformation applied for comparison. For the scenario where analysis is on a gene-by-gene basis, model assumptions such as normality are explored. The impact of using a wrong scale is illustrated with the log transformation and the fourth-root (quartic-root) transformation.
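
For readers unfamiliar with the Box-Cox family mentioned in the Methods, the following is a minimal sketch of the standard one-parameter form, (x^lambda - 1) / lambda with the log as the limiting case at lambda = 0. It is not the authors' code: the function name `box_cox` and the `signals` values are hypothetical and chosen only for illustration.

```python
import math


def box_cox(x, lam):
    """Standard one-parameter Box-Cox transform of a strictly positive value.

    lam = 0 gives the natural log (the limit of the power form as lam -> 0).
    lam = 0.25 is the fourth-root scale up to an affine rescaling.
    """
    if x <= 0:
        raise ValueError("Box-Cox requires strictly positive input")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam


# Hypothetical signal intensities, for illustration only (not data from the paper).
signals = [50.0, 200.0, 800.0, 3200.0]
log_scale = [box_cox(s, 0.0) for s in signals]
fourth_root_scale = [box_cox(s, 0.25) for s in signals]
```

Because the power form with lambda = 0.25 differs from the plain fourth root only by an affine rescaling, a t-test gives the same statistic on either version of that scale.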

Results: When all the genes on the array are considered together, the dependence of the variation level on the expression level can be satisfactorily removed by a Box-Cox transformation. When genes are analyzed individually, the distributional properties of the intensities are shown to be gene dependent. Derivation and simulation show that some loss of power is incurred when a wrong scale is used, but due to the robustness of the t-test, the loss is acceptable when the fold-change is not very large.
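
The kind of comparison described in the Results can be sketched with a small Monte Carlo experiment. This is an illustrative sketch, not the authors' simulation: it assumes the data are log-normal (so the log is the "right" scale and the fourth root the "wrong" one), and the group size, baseline mean, `sigma`, and `fold_change` are all hypothetical choices. The pooled two-sample t-test uses a hard-coded critical value that is only valid for 5 samples per group (8 degrees of freedom).

```python
import math
import random
import statistics

T_CRIT_DF8 = 2.306  # two-sided 5% critical value of Student's t with 8 df (n = 5 per group)


def pooled_t(a, b):
    """Two-sample t statistic with pooled variance."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * statistics.variance(a)
           + (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(sp2 * (1 / na + 1 / nb))


def estimate_power(transform, fold_change=2.0, sigma=0.5, reps=2000, seed=0):
    """Fraction of simulated experiments in which the t-test rejects at the 5% level.

    Data are assumed log-normal; `transform` is the scale used for the analysis.
    """
    n = 5  # fixed so that T_CRIT_DF8 is the correct critical value
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        control = [math.exp(rng.gauss(5.0, sigma)) for _ in range(n)]
        treated = [math.exp(rng.gauss(5.0 + math.log(fold_change), sigma))
                   for _ in range(n)]
        t = pooled_t([transform(x) for x in treated],
                     [transform(x) for x in control])
        rejections += abs(t) > T_CRIT_DF8
    return rejections / reps


# Same seed, so both scales are evaluated on identical simulated data sets.
power_log = estimate_power(math.log)
power_fourth_root = estimate_power(lambda x: x ** 0.25)
print(f"log scale: {power_log:.3f}, fourth-root scale: {power_fourth_root:.3f}")
```

Raising `fold_change` in this sketch is one way to explore how the gap between the two scales behaves under these assumptions.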
