Pretrained-Guided Conditional Diffusion Models for Microbiome Data Analysis
Xinyuan Shi, Fangfang Zhu, Wenwen Min
arXiv - QuanBio - Genomics, 2024-08-10. doi: arxiv-2408.07709 (https://doi.org/arxiv-2408.07709)
Emerging evidence indicates that human cancers are closely linked to
human microbiomes. However, sample sizes are limited and a significant amount
of data is lost during collection for various reasons, so several machine
learning methods have been proposed to address the issue of missing data.
These methods have not fully used the known clinical information of
patients to improve the accuracy of data imputation. Therefore,
we introduce mbVDiT, a novel pre-trained conditional diffusion model for
microbiome data imputation and denoising, which uses the unmasked data and
patient metadata as conditional guidance for imputing missing values. It
also uses a VAE to integrate other public microbiome datasets to enhance
model performance. Results on microbiome datasets from three different
cancer types demonstrate the performance of our method in comparison with
existing methods.
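
The conditioning idea described in the abstract can be illustrated with a minimal sketch. This is not the mbVDiT implementation: it assumes a standard DDPM-style forward noising step, and every name and value in it (make_condition_mask, forward_noise, the abundance and metadata values, the mask ratio, alpha_bar_t) is hypothetical. It only shows how some entries of a sample might be hidden and noised while the unmasked entries stay clean to serve, together with patient metadata, as the condition.

```python
import math
import random

def make_condition_mask(n_features, mask_ratio, rng):
    """Return a list of booleans: True marks an entry hidden from the model."""
    n_masked = int(round(n_features * mask_ratio))
    hidden = set(rng.sample(range(n_features), n_masked))
    return [i in hidden for i in range(n_features)]

def forward_noise(x0, mask, alpha_bar_t, rng):
    """DDPM-style forward step applied only to masked entries (an assumption).

    Masked entries: x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps.
    Unmasked entries are returned unchanged so they can act as conditioning.
    """
    xt, eps = [], []
    for value, is_masked in zip(x0, mask):
        if is_masked:
            e = rng.gauss(0.0, 1.0)
            xt.append(math.sqrt(alpha_bar_t) * value
                      + math.sqrt(1.0 - alpha_bar_t) * e)
            eps.append(e)
        else:
            xt.append(value)
            eps.append(0.0)
    return xt, eps

rng = random.Random(0)
abundances = [0.12, 0.00, 0.45, 0.03, 0.40]  # hypothetical taxa abundances, one sample
metadata = [1.0, 0.0, 63.0]                  # hypothetical encoded clinical covariates
mask = make_condition_mask(len(abundances), mask_ratio=0.4, rng=rng)
noisy, eps = forward_noise(abundances, mask, alpha_bar_t=0.5, rng=rng)
# A denoising network (not shown) would receive `noisy`, `mask`, and `metadata`
# and be trained to predict `eps` on the masked entries only.
```

At inference time, under the same assumptions, the truly missing entries would start from pure noise and be denoised step by step while the observed entries and metadata remain fixed as the condition.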