SimBPDD: Simulating differential distributions in Beta-Poisson models, in particular for single-cell RNA sequencing data
Author: Roman Schefzik
Journal: Applied Medical Informatics, volume 3 1, pages 283-298
Published: 2021-01-01 (Journal Article)
DOI: 10.33039/AMI.2021.03.003 (https://doi.org/10.33039/AMI.2021.03.003)
Beta-Poisson (BP) models employ Poisson distributions whose rate parameter is itself a Beta-distributed random variable. They have been shown to appropriately mimic gene expression distributions in single-cell ribonucleic acid sequencing (scRNA-seq), a breakthrough technology that allows sequencing of information from individual biological cells and facilitates fundamental insights into numerous fields of biology. A prominent scRNA-seq data analysis task is to identify differences in gene expression distributions across two conditions. To validate new statistical approaches in this context, one typically has to rely on accurate simulations, as usually no ground truth is available for an assessment. We introduce several simulation procedures for generating differential distributions (DDs) based on BP models. In particular, we describe how to create different types of DDs, mirroring various sources or origins of a difference, and different degrees of DDs, from a weak to a strong difference. The soundness of the simulation procedures is shown in a validation study in which theoretically expected model properties of the DD simulations are confirmed. The findings are in principle not restricted to the scRNA-seq context and may also apply to other application areas. The simulation approaches are implemented in the publicly available R package SimBPDD.
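To make the model structure concrete, the following is a minimal sketch of drawing counts from a BP model and producing two conditions that differ in distribution. It is written in Python with the standard library only and does not reproduce the SimBPDD package or its API. The function names, the `scale` factor multiplying the Beta draw, and all parameter values are assumptions made for illustration. Under this assumed parameterization, with p ~ Beta(alpha, beta) and X | p ~ Poisson(scale * p), the mean is scale * alpha / (alpha + beta), so changing `alpha` between conditions is one simple way to obtain a difference.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson(rate) variate with Knuth's multiplication method.

    Suitable for moderate rates only: exp(-rate) underflows for very large rates.
    """
    if rate <= 0.0:
        return 0
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    # Count how many uniform draws can be multiplied before the product falls to the threshold.
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_beta_poisson(alpha, beta, scale, size, rng):
    """Draw `size` counts: p ~ Beta(alpha, beta), then X | p ~ Poisson(scale * p).

    The `scale` factor is an assumption of this sketch, not taken from the abstract.
    """
    return [
        sample_poisson(scale * rng.betavariate(alpha, beta), rng)
        for _ in range(size)
    ]


def bp_mean(alpha, beta, scale):
    """Theoretical mean of the assumed model: scale * E[p]."""
    return scale * alpha / (alpha + beta)


rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Hypothetical parameter values: the two conditions differ only in alpha.
condition_1 = sample_beta_poisson(2.0, 5.0, 20.0, 5000, rng)
condition_2 = sample_beta_poisson(4.0, 5.0, 20.0, 5000, rng)

mean_1 = sum(condition_1) / len(condition_1)
mean_2 = sum(condition_2) / len(condition_2)
print("condition 1: empirical", mean_1, "theoretical", bp_mean(2.0, 5.0, 20.0))
print("condition 2: empirical", mean_2, "theoretical", bp_mean(4.0, 5.0, 20.0))
```

Comparing empirical moments of the simulated counts with their theoretical values, as the last lines do for the mean, is a small-scale analogue of checking expected model properties in a validation study.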