Pseudotime Analysis Imputes the Missing Liver NAFLD Status in Public RNA-Seq Cohorts
Authors: Tongyang Wang, Xiangmei Dou
Published in: Proceedings of the 6th International Conference on Biomedical Engineering and Applications, 2022-05-13
DOI: 10.1145/3543081.3543094 (https://doi.org/10.1145/3543081.3543094)
Existing gene expression studies based on microarrays or RNA sequencing have been unable to resolve the complex mechanisms of non-alcoholic fatty liver disease (NAFLD) progression because of insufficient accuracy and a lack of phenotypic data. In particular, incomplete phenotypic data in public liver gene expression cohorts have hampered many studies of NAFLD progression. To address this issue, we adopt pseudotime analysis to estimate liver health status from human liver gene expression data. Differential expression (DE) analysis identifies a set of 25 genes that are differentially expressed between healthy controls and NAFLD samples. These DE genes separate NAFLD patients from healthy controls in hierarchical clustering, and their associated biological pathways are highly relevant to liver signaling and injury, which implies a close relationship between DE gene expression and NAFLD. Furthermore, the pseudotime analysis models the deterioration of NAFLD by using liver fat percentage to represent NAFLD severity and by aligning the candidate samples along the estimated trajectory according to their gene expression and covariates; we validated the pseudotime model on a separate microarray cohort. The validated model is then applied to an RNA-Seq cohort (GTEx) to estimate the liver health status of samples that lack phenotypic details. The model recapitulates the timeline of NAFLD progression and supports potential key roles for the DE genes in this process. In conclusion, the expression of these genes and its changes across distinct sample groups are chronologically consistent with increasing NAFLD severity. The pseudotime model can be used to impute missing NAFLD phenotypes in public liver gene expression cohorts.
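
The idea of ordering samples along a severity trajectory and then placing unlabeled samples on it can be illustrated with a small sketch. This is not the authors' pipeline: the abstract does not name the pseudotime algorithm, so the sketch swaps in a deliberately simple stand-in (the first principal component of standardized expression, oriented to increase with liver fat percentage). The data are synthetic, and the names `fit_pseudotime`, `project`, and `simulate` are hypothetical.

```python
import numpy as np


def fit_pseudotime(expr, severity):
    """Fit a 1-D trajectory to a samples x genes matrix.

    Assumption (not from the paper): the trajectory is the first principal
    component of the standardized expression, oriented so that it increases
    with the severity measure (for example liver fat percentage).
    """
    mean = expr.mean(axis=0)
    std = expr.std(axis=0)
    std[std == 0] = 1.0  # avoid division by zero for constant genes
    z = (expr - mean) / std
    # The first right singular vector is the direction of largest variance.
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    axis = vt[0]
    # Flip the axis if it runs against the severity measure.
    if np.corrcoef(z @ axis, severity)[0, 1] < 0:
        axis = -axis
    return {"mean": mean, "std": std, "axis": axis}


def project(model, expr):
    """Place samples on the fitted trajectory using the training scaling."""
    z = (expr - model["mean"]) / model["std"]
    return z @ model["axis"]


def simulate(rng, n_samples, loadings):
    """Synthetic cohort: each gene responds linearly to liver fat, plus noise."""
    fat = rng.uniform(0.0, 40.0, size=n_samples)
    noise = rng.normal(scale=1.0, size=(n_samples, loadings.size))
    return fat, np.outer(fat, loadings) + noise


rng = np.random.default_rng(0)
loadings = rng.normal(size=25)  # 25 hypothetical DE genes

# Training cohort with known liver fat percentage.
fat_train, expr_train = simulate(rng, 60, loadings)
model = fit_pseudotime(expr_train, fat_train)

# Second cohort: liver fat is treated as unknown and used only for checking.
fat_new, expr_new = simulate(rng, 40, loadings)
pseudotime_new = project(model, expr_new)

print("correlation with held-out liver fat:",
      round(float(np.corrcoef(pseudotime_new, fat_new)[0, 1]), 3))
```

In this toy setting the projected pseudotime tracks the held-out liver fat values closely because the simulated genes respond linearly to fat. Real cohorts would additionally need covariate adjustment and cross-platform normalization, neither of which is modeled here.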