Inductive Logic Programming for Structure-Activity Relationship Studies on Large Scale Data

C. Nattee, S. Sinthupinyo, M. Numao, T. Okada
Published in: 2005 Symposium on Applications and the Internet Workshops (SAINT 2005 Workshops)
Publication date: 2005-01-31
DOI: 10.1109/SAINTW.2005.68 (https://doi.org/10.1109/SAINTW.2005.68)
Citations: 5

Abstract

Inductive Logic Programming (ILP) combines inductive learning with first-order logic to learn first-order hypotheses from training examples. Its main bottleneck is the hypothesis search space, which is intractably large. This makes existing approaches perform poorly on large-scale real-world datasets. In this research, we propose a technique that lets the system handle an enormous search space efficiently by incorporating qualitative information into the search heuristics. Currently, heuristic functions used in ILP systems are based only on quantitative information, e.g. the number of examples covered and the length of candidates. We focus on data in which each example consists of several parts. The approach aims to find hypotheses describing each class by using both the individual features of the parts and the relations between them. Such data arise in representing chemical compound structures for Structure-Activity Relationship (SAR) studies. We apply the proposed method to extract rules that describe the chemical activity of compounds from their structures. The experiments are conducted on a real-world dataset, and the results are compared with existing ILP methods using ten-fold cross-validation.
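
The abstract describes existing ILP heuristics as purely quantitative, based on the number of examples covered and the length of a candidate. A minimal sketch of such a baseline heuristic may make this concrete. It is an illustration only, not the method proposed in the paper: the `Compound` and `Candidate` types, the helper names, and the toy compounds are all assumptions introduced here, and the scoring rule (positives covered minus negatives covered, with ties broken in favour of shorter candidates) is just one simple instance of a coverage-and-length heuristic.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical representation: a compound is a set of parts (atoms) with an
# individual feature (element), plus relations (bonds) between those parts.
@dataclass(frozen=True)
class Compound:
    atoms: Dict[str, str]               # atom id -> element symbol
    bonds: Tuple[Tuple[str, str], ...]  # undirected pairs of atom ids

# Hypothetical candidate rule: a coverage test plus its length in literals.
@dataclass(frozen=True)
class Candidate:
    name: str
    length: int
    covers: Callable[[Compound], bool]

def has_element(element: str) -> Candidate:
    """Individual feature: some atom in the compound has this element."""
    return Candidate(f"atom({element})", 1,
                     lambda c: element in c.atoms.values())

def has_bond(e1: str, e2: str) -> Candidate:
    """Relational feature: some bond links atoms of elements e1 and e2."""
    def covers(c: Compound) -> bool:
        return any({c.atoms[a], c.atoms[b]} == {e1, e2} for a, b in c.bonds)
    # Counted as three literals here: two atom literals and one bond literal.
    return Candidate(f"bond({e1},{e2})", 3, covers)

def quantitative_score(cand: Candidate,
                       positives: List[Compound],
                       negatives: List[Compound]) -> Tuple[int, int]:
    """Purely quantitative heuristic: coverage first, then shorter length."""
    pos = sum(cand.covers(c) for c in positives)
    neg = sum(cand.covers(c) for c in negatives)
    return (pos - neg, -cand.length)

# Toy data, invented for illustration (not from the paper's dataset).
positives = [
    Compound({"a1": "N", "a2": "O"}, (("a1", "a2"),)),
    Compound({"a1": "N", "a2": "O", "a3": "C"}, (("a1", "a2"), ("a2", "a3"))),
]
negatives = [
    Compound({"a1": "N", "a2": "C", "a3": "O"}, (("a1", "a2"), ("a2", "a3"))),
]

candidates = [has_element("N"), has_bond("N", "O")]
best = max(candidates, key=lambda c: quantitative_score(c, positives, negatives))
print(best.name, quantitative_score(best, positives, negatives))
```

In this toy setting the relational candidate separates the classes while the single-atom candidate does not, which is the kind of distinction a rule over both individual and relational features of parts can express. How qualitative information would enter the heuristic is specific to the paper and is not sketched here.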