Harmonizing heterogeneous transcriptomics datasets for machine learning-based analysis to identify spaceflown murine liver-specific changes.

IF 4.4 | CAS Zone 1, Physics and Astrophysics | JCR Q1, MULTIDISCIPLINARY SCIENCES
Hari Ilangovan, Prachi Kothiyal, Katherine A Hoadley, Robin Elgart, Greg Eley, Parastou Eslami
Journal: npj Microgravity
Publication date: 2024-06-11
Publication type: Journal Article
DOI: https://doi.org/10.1038/s41526-024-00379-3
Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11167036/pdf/
Citations: 0

Abstract

Harmonizing heterogeneous transcriptomics datasets for machine learning-based analysis to identify spaceflown murine liver-specific changes.

NASA has employed high-throughput molecular assays to identify sub-cellular changes impacting human physiology during spaceflight. Machine learning (ML) methods hold the promise to improve our ability to identify important signals within highly dimensional molecular data. However, the inherent limitation of study subject numbers within a spaceflight mission minimizes the utility of ML approaches. To overcome the sample power limitations, data from multiple spaceflight missions must be aggregated while appropriately addressing intra- and inter-study variabilities. Here we describe an approach to log transform, scale and normalize data from six heterogeneous, mouse liver-derived transcriptomics datasets (n_total = 137) which enabled ML-methods to classify spaceflown vs. ground control animals (AUC ≥ 0.87) while mitigating the variability from mission-of-origin. Concordance was found between liver-specific biological processes identified from harmonized ML-based analysis and study-by-study classical omics analysis. This work demonstrates the feasibility of applying ML methods on integrated, heterogeneous datasets of small sample size.
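
The abstract names three preprocessing steps (log transform, scale, normalize) and an AUC-based evaluation but does not give the exact procedure. The following is only a minimal sketch of one common way such steps can be combined: a log2(count + 1) transform followed by per-gene z-scoring within each study, plus a pairwise AUC helper. The function names `harmonize` and `auc`, the `(study_id, counts)` data layout, and the toy values are all hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev


def harmonize(samples):
    """Log-transform counts, then z-score each gene within its own study.

    `samples` is a list of (study_id, [gene counts]) pairs. This layout is
    an assumption made for illustration, not the paper's data format.
    """
    # Step 1: log2(count + 1) compresses the dynamic range of raw counts.
    logged = [(study, [math.log2(c + 1.0) for c in counts])
              for study, counts in samples]

    # Step 2: group sample indices by study of origin.
    by_study = defaultdict(list)
    for idx, (study, _) in enumerate(logged):
        by_study[study].append(idx)

    # Step 3: centre and scale every gene using only that study's samples,
    # so a study-wide offset no longer separates the missions.
    n_genes = len(logged[0][1])
    out = [[0.0] * n_genes for _ in logged]
    for idxs in by_study.values():
        for g in range(n_genes):
            vals = [logged[i][1][g] for i in idxs]
            mu, sd = mean(vals), pstdev(vals)
            for i in idxs:
                out[i][g] = (logged[i][1][g] - mu) / sd if sd > 0 else 0.0
    return out


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (invented): two studies with very different count scales.
toy = [
    ("study_A", [10, 200]), ("study_A", [30, 180]),
    ("study_B", [1000, 50000]), ("study_B", [4000, 40000]),
]
harmonized = harmonize(toy)
```

After this step each gene has zero mean within every study, so a classifier trained on the pooled rows cannot rely on a study-wide shift in expression level. Whether per-study z-scoring matches the authors' actual scaling and normalization is not stated in the abstract.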

Source journal: npj Microgravity
Subject area: Physics and Astronomy (miscellaneous)
CiteScore: 7.30
Self-citation rate: 7.80%
Articles published: 50
Review time: 9 weeks
About the journal: A new open access, online-only, multidisciplinary research journal, npj Microgravity is dedicated to publishing the most important scientific advances in the life sciences, physical sciences, and engineering fields that are facilitated by spaceflight and analogue platforms.