利用自动挖掘的结构模式进行化合物分类。

Proceedings of the ... Asia-Pacific bioinformatics conference Pub Date : 2008-01-01 DOI:10.1901/jaba.2008.6-39

A M Smalter, J Huan, G H Lushington

{"title":"利用自动挖掘的结构模式进行化合物分类。","authors":"A M Smalter, J Huan, G H Lushington","doi":"10.1901/jaba.2008.6-39","DOIUrl":null,"url":null,"abstract":"In this paper we propose new methods of chemical structure classification based on the integration of graph database mining from data mining and graph kernel functions from machine learning. In our method, we first identify a set of general graph patterns in chemical structure data. These patterns are then used to augment a graph kernel function that calculates the pairwise similarity between molecules. The obtained similarity matrix is used as input to classify chemical compounds via a kernel machines such as the support vector machine (SVM). Our results indicate that the use of a pattern-based approach to graph similarity yields performance profiles comparable to, and sometimes exceeding that of the existing state-of-the-art approaches. In addition, the identification of highly discriminative patterns for activity classification provides evidence that our methods can make generalizations about a compound's function given its chemical structure. While we evaluated our methods on molecular structures, these methods are designed to operate on general graph data and hence could easily be applied to other domains in bioinformatics.","PeriodicalId":74513,"journal":{"name":"Proceedings of the ... Asia-Pacific bioinformatics conference","volume":"6 ","pages":"39-48"},"PeriodicalIF":0.0000,"publicationDate":"2008-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2864492/pdf/nihms118197.pdf","citationCount":"0","resultStr":"{\"title\":\"CHEMICAL COMPOUND CLASSIFICATION WITH AUTOMATICALLY MINED STRUCTURE PATTERNS.\",\"authors\":\"A M Smalter, J Huan, G H Lushington\",\"doi\":\"10.1901/jaba.2008.6-39\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper we propose new methods of chemical structure classification based on the integration of graph database mining from data mining and graph kernel functions from machine learning. In our method, we first identify a set of general graph patterns in chemical structure data. These patterns are then used to augment a graph kernel function that calculates the pairwise similarity between molecules. The obtained similarity matrix is used as input to classify chemical compounds via a kernel machines such as the support vector machine (SVM). Our results indicate that the use of a pattern-based approach to graph similarity yields performance profiles comparable to, and sometimes exceeding that of the existing state-of-the-art approaches. In addition, the identification of highly discriminative patterns for activity classification provides evidence that our methods can make generalizations about a compound's function given its chemical structure. While we evaluated our methods on molecular structures, these methods are designed to operate on general graph data and hence could easily be applied to other domains in bioinformatics.\",\"PeriodicalId\":74513,\"journal\":{\"name\":\"Proceedings of the ... Asia-Pacific bioinformatics conference\",\"volume\":\"6 \",\"pages\":\"39-48\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2008-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2864492/pdf/nihms118197.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the ... Asia-Pacific bioinformatics conference\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1901/jaba.2008.6-39\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the ... Asia-Pacific bioinformatics conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1901/jaba.2008.6-39","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

本文基于数据挖掘中的图数据库挖掘和机器学习中的图核函数的整合，提出了化学结构分类的新方法。在我们的方法中，我们首先从化学结构数据中识别出一组通用图模式。然后利用这些模式来增强计算分子间成对相似性的图核函数。得到的相似性矩阵作为输入，通过支持向量机（SVM）等核机器对化合物进行分类。我们的研究结果表明，使用基于模式的图形相似性方法所产生的性能曲线可与现有的最先进方法相媲美，有时甚至超过它们。此外，对活性分类的高区分度模式的识别证明，我们的方法可以根据化合物的化学结构对其功能进行归纳。虽然我们是在分子结构上对我们的方法进行评估的，但这些方法是为在一般图数据上运行而设计的，因此很容易应用于生物信息学的其他领域。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

CHEMICAL COMPOUND CLASSIFICATION WITH AUTOMATICALLY MINED STRUCTURE PATTERNS.

In this paper we propose new methods of chemical structure classification based on the integration of graph database mining from data mining and graph kernel functions from machine learning. In our method, we first identify a set of general graph patterns in chemical structure data. These patterns are then used to augment a graph kernel function that calculates the pairwise similarity between molecules. The obtained similarity matrix is used as input to classify chemical compounds via a kernel machines such as the support vector machine (SVM). Our results indicate that the use of a pattern-based approach to graph similarity yields performance profiles comparable to, and sometimes exceeding that of the existing state-of-the-art approaches. In addition, the identification of highly discriminative patterns for activity classification provides evidence that our methods can make generalizations about a compound's function given its chemical structure. While we evaluated our methods on molecular structures, these methods are designed to operate on general graph data and hence could easily be applied to other domains in bioinformatics.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the ... Asia-Pacific bioinformatics conference

自引率

0.00%

发文量