Weighted Trigonometric Regression for Suboptimal Designs in Circadian Transcriptome Studies

Impact factor 1.8 · Zone 4 (Medicine) · JCR Q3 (Mathematical & Computational Biology)
Michael T Gorczyca, Justice D Sefas
Statistics in Medicine, vol. 44, no. 20-22, e70201. Journal article, published 2025-09-01. Not open access.
DOI: 10.1002/sim.70201 (https://doi.org/10.1002/sim.70201)
Citations: 0

Abstract


Circadian transcriptome studies often use trigonometric regression to model gene expression over time. Ideally, protocols in these studies would collect tissue samples at evenly distributed and equally spaced time points over a 24-hour period. This sample collection protocol is known as an equispaced design, which is considered the optimal experimental design for trigonometric regression under multiple statistical criteria. However, implementing equispaced designs in studies involving individuals is logistically challenging, and failure to employ an equispaced design could introduce variability in the statistical power of a hypothesis test relative to a model's phase-shift parameter estimates. This article is motivated by the variability in power for hypothesis testing when tissue samples are not collected under an equispaced design, and considers a weighted trigonometric regression as a remedy. Specifically, the weights for this regression are the normalized reciprocals of estimates derived from a kernel density estimator for sample collection time, which inflates the weight of samples collected at underrepresented time points. A search procedure is also introduced to identify the hyperparameter for kernel density estimation that relates to maximizing the smallest eigenvalue of the Hessian of weighted squared loss, which is motivated by the E-optimality criterion from experimental design literature. Simulation studies consistently demonstrate that this weighted regression mitigates variability in power for hypothesis tests performed with an estimated model. Illustrations with six circadian transcriptome datasets further indicate that this weighted regression consistently yields larger test statistics than its unweighted counterpart for first-order trigonometric regression, or cosinor regression, which is prevalent in circadian transcriptome studies.
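
The weighting scheme described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not name the kernel, the form of the hyperparameter search, or any software, so the Gaussian kernel on wrapped 24-hour time differences, the plain grid search over bandwidths, and every function name below (design_matrix, circular_kde, kde_weights, min_eigenvalue, select_bandwidth, weighted_cosinor_fit) are hypothetical choices. The sketch covers first-order (cosinor) regression only and leaves out the hypothesis test.

```python
import numpy as np

PERIOD = 24.0  # hours in one circadian cycle


def design_matrix(times, period=PERIOD):
    """Cosinor design matrix with columns [1, cos(2*pi*t/T), sin(2*pi*t/T)]."""
    angle = 2.0 * np.pi * np.asarray(times, dtype=float) / period
    return np.column_stack([np.ones_like(angle), np.cos(angle), np.sin(angle)])


def circular_kde(times, bandwidth, period=PERIOD):
    """Density estimate at each sample time.

    Assumption: Gaussian kernel applied to the shortest time difference on
    a circle of length `period`; the paper may use a different kernel.
    """
    t = np.asarray(times, dtype=float)
    diff = t[:, None] - t[None, :]
    # Wrap differences into [-period/2, period/2).
    diff = (diff + period / 2.0) % period - period / 2.0
    kernel = np.exp(-0.5 * (diff / bandwidth) ** 2) / (bandwidth * np.sqrt(2.0 * np.pi))
    return kernel.mean(axis=1)


def kde_weights(times, bandwidth):
    """Normalized reciprocals of the density estimates (weights sum to 1)."""
    inverse_density = 1.0 / circular_kde(times, bandwidth)
    return inverse_density / inverse_density.sum()


def min_eigenvalue(times, bandwidth):
    """Smallest eigenvalue of X^T W X.

    The Hessian of the weighted squared loss sum_i w_i (y_i - x_i^T b)^2
    is 2 X^T W X, so the constant factor does not affect the ranking.
    """
    X = design_matrix(times)
    w = kde_weights(times, bandwidth)
    return np.linalg.eigvalsh(X.T @ (w[:, None] * X))[0]


def select_bandwidth(times, grid):
    """Assumed search procedure: pick the grid value with the largest score."""
    scores = [min_eigenvalue(times, h) for h in grid]
    return grid[int(np.argmax(scores))]


def weighted_cosinor_fit(times, y, bandwidth):
    """Weighted least squares estimate of [intercept, cos coef, sin coef]."""
    X = design_matrix(times)
    root_w = np.sqrt(kde_weights(times, bandwidth))
    beta, *_ = np.linalg.lstsq(root_w[:, None] * X, root_w * np.asarray(y, dtype=float), rcond=None)
    return beta


# Illustrative data: collection times clustered between 08:00 and 12:00.
times = np.array([8.0, 8.5, 9.0, 9.5, 10.0, 10.5, 11.0, 11.5, 12.0, 16.0, 22.0, 2.0])
rng = np.random.default_rng(0)
y = design_matrix(times) @ np.array([2.0, 1.5, 0.5]) + rng.normal(scale=0.1, size=times.size)

grid = np.linspace(0.5, 6.0, 12)  # candidate bandwidths in hours
h_star = select_bandwidth(times, grid)
beta_hat = weighted_cosinor_fit(times, y, h_star)
```

With weights that sum to one, the smallest eigenvalue of X^T W X cannot exceed 0.5 for a first-order model, and an equispaced design with uniform weights attains that value, which is why the score serves as a measure of how close the reweighted sample comes to the equispaced ideal.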

Source journal
Statistics in Medicine (Medicine: Public, Environmental & Occupational Health)
CiteScore: 3.40
Self-citation rate: 10.00%
Articles published: 334
Review time: 2-4 weeks
About the journal: The journal aims to influence practice in medicine and its associated sciences through the publication of papers on statistical and other quantitative methods. Papers will explain new methods and demonstrate their application, preferably through a substantive, real, motivating example or a comprehensive evaluation based on an illustrative example. Alternatively, papers will report on case studies where creative use or technical generalizations of established methodology are directed towards a substantive application. Reviews of, and tutorials on, general topics relevant to the application of statistics to medicine will also be published. The main criteria for publication are appropriateness of the statistical methods to a particular medical problem and clarity of exposition. Papers with primarily mathematical content will be excluded. The journal aims to enhance communication between statisticians, clinicians and medical researchers.