归纳逻辑程序设计在微阵列数据中的应用方法

Hiromu Ide, M. Umezawa, H. Ohwada
{"title":"归纳逻辑程序设计在微阵列数据中的应用方法","authors":"Hiromu Ide, M. Umezawa, H. Ohwada","doi":"10.1145/3156346.3156356","DOIUrl":null,"url":null,"abstract":"This paper describing a method of specifying common terms of genes from microarray data in 3 steps. First, we use random forest for extracting disease-related genes and it give each gene variable importance. The higher the variable importance, the more effective feature for classification. We extract genes whose variable importance more than 0 and set them positive samples and the rest set negative samples for ILP. Next, we annotate extracted genes by using Gene Ontology (GO) and use the term as predicate for ILP. Annotation is the process of assigning GO terms to gene products. Finally, we obtain rules about common terms in positive samples by using ILP. ILP is a subfield of machine learning which uses logic programming as a uniform representation technique for examples, background knowledge and hypotheses. ILP learns based on background knowledge. Background knowledge is represented in first-order logic. In the result, we extracted 1051 mRNA as positive samples for ILP from random forest and its F-measure score was 65.1%. We obtained about 4000 terms at each dataset and use them as predicates for ILP. We got eventually some rules about positive samples.","PeriodicalId":415207,"journal":{"name":"Proceedings of the 8th International Conference on Computational Systems-Biology and Bioinformatics","volume":"31 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Proposal of application method of Inductive Logic Programming to microarray data\",\"authors\":\"Hiromu Ide, M. Umezawa, H. Ohwada\",\"doi\":\"10.1145/3156346.3156356\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper describing a method of specifying common terms of genes from microarray data in 3 steps. First, we use random forest for extracting disease-related genes and it give each gene variable importance. The higher the variable importance, the more effective feature for classification. We extract genes whose variable importance more than 0 and set them positive samples and the rest set negative samples for ILP. Next, we annotate extracted genes by using Gene Ontology (GO) and use the term as predicate for ILP. Annotation is the process of assigning GO terms to gene products. Finally, we obtain rules about common terms in positive samples by using ILP. ILP is a subfield of machine learning which uses logic programming as a uniform representation technique for examples, background knowledge and hypotheses. ILP learns based on background knowledge. Background knowledge is represented in first-order logic. In the result, we extracted 1051 mRNA as positive samples for ILP from random forest and its F-measure score was 65.1%. We obtained about 4000 terms at each dataset and use them as predicates for ILP. We got eventually some rules about positive samples.\",\"PeriodicalId\":415207,\"journal\":{\"name\":\"Proceedings of the 8th International Conference on Computational Systems-Biology and Bioinformatics\",\"volume\":\"31 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-12-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 8th International Conference on Computational Systems-Biology and Bioinformatics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3156346.3156356\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 8th International Conference on Computational Systems-Biology and Bioinformatics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3156346.3156356","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

本文描述了一种从微阵列数据中分三步确定基因共同术语的方法。首先,我们使用随机森林提取疾病相关基因,并赋予每个基因变量的重要性。变量重要度越高,特征对分类越有效。我们提取变量重要度大于0的基因作为ILP的阳性样本,其余的作为阴性样本。接下来,我们使用基因本体(Gene Ontology, GO)对提取的基因进行注释,并使用该术语作为ILP的谓词。注释是将GO术语分配给基因产物的过程。最后,我们利用ILP方法得到了阳性样本中公共项的规则。ILP是机器学习的一个子领域,它使用逻辑编程作为示例、背景知识和假设的统一表示技术。ILP基于背景知识进行学习。背景知识用一阶逻辑表示。结果,我们从随机森林中提取了1051个mRNA作为ILP阳性样本,其F-measure得分为65.1%。我们在每个数据集中获得了大约4000个术语,并将它们用作ILP的谓词。我们最终得到了一些关于阳性样本的规则。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Proposal of application method of Inductive Logic Programming to microarray data
This paper describing a method of specifying common terms of genes from microarray data in 3 steps. First, we use random forest for extracting disease-related genes and it give each gene variable importance. The higher the variable importance, the more effective feature for classification. We extract genes whose variable importance more than 0 and set them positive samples and the rest set negative samples for ILP. Next, we annotate extracted genes by using Gene Ontology (GO) and use the term as predicate for ILP. Annotation is the process of assigning GO terms to gene products. Finally, we obtain rules about common terms in positive samples by using ILP. ILP is a subfield of machine learning which uses logic programming as a uniform representation technique for examples, background knowledge and hypotheses. ILP learns based on background knowledge. Background knowledge is represented in first-order logic. In the result, we extracted 1051 mRNA as positive samples for ILP from random forest and its F-measure score was 65.1%. We obtained about 4000 terms at each dataset and use them as predicates for ILP. We got eventually some rules about positive samples.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信