Investigation of normalization procedures for transcriptome profiles of compounds oriented toward practical study design.

IF 1.8 | Zone 4 (Medicine) | Q4 TOXICOLOGY
Tadahaya Mizuno, Hiroyuki Kusuhara
Journal of Toxicological Sciences, vol. 49, no. 6, pp. 249-259, 2024. Journal Article. DOI: 10.2131/jts.49.249 (https://doi.org/10.2131/jts.49.249)

Abstract

The transcriptome profile is a representative phenotype-based descriptor of compounds and is widely recognized for capturing compound effects. Batch differences, however, are inevitable, and many of the sophisticated statistical methods available for handling them presume a substantial sample size. How should a transcriptome analysis be designed to obtain robust compound profiles, particularly for the small datasets frequently encountered in practice? This study addresses the question by investigating normalization procedures for transcriptome profiles, focusing on the baseline distribution used to derive biological responses as profiles. First, we investigated two large GeneChip datasets and compared the impact of different normalization procedures. By evaluating the similarity between response profiles of biological replicates within each dataset, and the similarity between response profiles of the same compound across datasets, we found that a baseline distribution defined by all samples within each batch, under batch-corrected conditions, is a good choice for large datasets. We then conducted a simulation to explore how the number of control samples influences the robustness of response profiles across datasets. The results offer insight into choosing a suitable number of control samples for small datasets. These conclusions are drawn from limited datasets. Nevertheless, we believe this study improves understanding of how to use compound transcriptome profiles effectively and contributes to the knowledge needed for their practical application.
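The abstract does not give the exact formulas, so the following is only a minimal sketch of the general idea, not the authors' procedure. It assumes a response profile is a per-gene z-score of a treated sample against a chosen baseline matrix (which could hold the control samples or all samples in a batch), and it uses a toy simulation with made-up parameters to show how the number of control samples can affect the agreement of profiles for the same compound across two batches. All function names, the noise model, and the default values are hypothetical.

```python
import numpy as np


def response_profile(sample, baseline):
    """Z-score a sample (genes,) against a baseline matrix (n_samples, genes).

    Assumption: the baseline may be the control samples or all samples in a
    batch; the choice of baseline is the variable the abstract discusses.
    """
    mu = baseline.mean(axis=0)
    sd = baseline.std(axis=0, ddof=1)
    # Guard against zero variance so constant genes yield 0 rather than NaN.
    return (sample - mu) / np.where(sd > 0, sd, 1.0)


def simulate_batch(rng, effect, n_controls, n_genes, batch_sd=1.0, noise_sd=0.5):
    """Hypothetical batch: controls and one treated sample share a batch shift."""
    shift = rng.normal(0.0, batch_sd, n_genes)
    controls = shift + rng.normal(0.0, noise_sd, (n_controls, n_genes))
    treated = shift + effect + rng.normal(0.0, noise_sd, n_genes)
    return controls, treated


def cross_batch_similarity(n_controls, n_genes=500, n_trials=200, seed=0):
    """Mean Pearson correlation of one compound's profiles from two batches."""
    rng = np.random.default_rng(seed)
    effect = rng.normal(0.0, 1.0, n_genes)  # assumed true compound effect
    sims = []
    for _ in range(n_trials):
        c1, t1 = simulate_batch(rng, effect, n_controls, n_genes)
        c2, t2 = simulate_batch(rng, effect, n_controls, n_genes)
        p1 = response_profile(t1, c1)
        p2 = response_profile(t2, c2)
        sims.append(np.corrcoef(p1, p2)[0, 1])
    return float(np.mean(sims))


if __name__ == "__main__":
    # Print the mean cross-batch similarity for several control-sample counts.
    for n in (2, 4, 8, 16, 32):
        print(n, round(cross_batch_similarity(n), 3))
```

In this toy setting, very few controls give an unstable baseline mean and standard deviation, which tends to lower cross-batch agreement. Real data would need the study's own datasets and batch-correction step, which are not reproduced here.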

Source journal
CiteScore: 3.20
Self-citation rate: 5.00%
Articles published: 53
Review time: 4-8 weeks
About the journal: The Journal of Toxicological Sciences (J. Toxicol. Sci.) is a scientific journal that publishes research on the mechanisms and significance of the toxicity of substances such as drugs, food additives, food contaminants, and environmental pollutants. As a general rule, papers on the toxicities and effects of extracts and mixtures containing unidentified compounds are not accepted.