Predicting potential gene ontology from cellular response data
Hao Hong, Xiaoyao Yin, Fei Li, Naiyang Guan, Xiaochen Bo, Zhigang Luo
Proceedings of the 5th International Conference on Bioinformatics and Computational Biology
Published: 2017-01-06
DOI: 10.1145/3035012.3035015 (https://doi.org/10.1145/3035012.3035015)
Citations: 2
Abstract
Ontologies have proven useful for capturing and organizing knowledge as a hierarchical set of terms and their relationships. However, curating gene ontology data by hand requires specialized knowledge of a particular field and is inefficient. Inferring gene ontology from the exponentially growing body of biological data has therefore become an active research topic. Based on data from the Library of Integrated Network-Based Cellular Signatures (LINCS), we hypothesized that genes participating in analogous biological processes might affect cells in similar ways. By assessing the cellular response after genes were knocked out, we built a similarity matrix with Gene Set Enrichment Analysis (GSEA) and clustered the genes with the affinity propagation algorithm. We then mapped the clustering result to gene ontology biological process data for annotation and enrichment analysis. This confirmed our hypothesis and, for the first time, made it possible to predict biological processes for unannotated genes from cellular response data after gene knockout. We further validated the soundness of this approach using the gene ontology molecular function data.
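Two steps named in the abstract, a GSEA-based similarity and the enrichment analysis of clusters against gene ontology terms, can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything below is an assumption: `enrichment_score` is the unweighted (Kolmogorov-Smirnov style) running-sum statistic commonly associated with GSEA, and `hypergeom_enrichment_p` is a standard hypergeometric over-representation test, a common choice for cluster enrichment. Neither is claimed to be the authors' implementation. The function names, gene labels, and counts are hypothetical.

```python
from math import comb


def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score of gene_set in ranked_genes.

    Walk down the ranked list, stepping up by 1/n_hit on a gene in the set
    and down by 1/n_miss otherwise; return the largest deviation from zero.
    (Assumed statistic; the abstract does not specify the GSEA variant.)
    """
    hits = set(gene_set) & set(ranked_genes)
    n_hit = len(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        return 0.0
    running, best = 0.0, 0.0
    for gene in ranked_genes:
        running += 1.0 / n_hit if gene in hits else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def hypergeom_enrichment_p(n_universe, n_annotated, n_cluster, n_overlap):
    """P(overlap >= n_overlap) under the hypergeometric null.

    n_universe:  genes in the background
    n_annotated: background genes annotated with the GO term
    n_cluster:   genes in the cluster being tested
    n_overlap:   cluster genes annotated with the GO term
    """
    upper = min(n_annotated, n_cluster)
    total = comb(n_universe, n_cluster)
    tail = sum(
        comb(n_annotated, k) * comb(n_universe - n_annotated, n_cluster - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Hypothetical toy data: a response signature ranked from most up-regulated
# to most down-regulated, and the top genes of another knockout signature.
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
es_top = enrichment_score(ranked, {"g1", "g2"})
es_bottom = enrichment_score(ranked, {"g5", "g6"})

# Hypothetical counts: 10 background genes, 4 annotated with a term,
# a cluster of 3 genes, all 3 of them annotated.
p_value = hypergeom_enrichment_p(10, 4, 3, 3)
```

In a pipeline of the kind the abstract describes, scores like `es_top` computed for every pair of knockout signatures would fill the similarity matrix passed to a clustering step, and a small `p_value` for a cluster and a biological process term would suggest that term as a candidate annotation for the cluster's unannotated genes.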