Marisa Thoma, Hong Cheng, A. Gretton, Jiawei Han, H. Kriegel, Alex Smola, Le Song, Philip S. Yu, Xifeng Yan, Karsten M. Borgwardt
求助PDF
{"title":"具有最优性保证的判别频繁子图挖掘","authors":"Marisa Thoma, Hong Cheng, A. Gretton, Jiawei Han, H. Kriegel, Alex Smola, Le Song, Philip S. Yu, Xifeng Yan, Karsten M. Borgwardt","doi":"10.1002/SAM.V3:5","DOIUrl":null,"url":null,"abstract":"The goal of frequent subgraph mining is to detect subgraphs that frequently occur in a dataset of graphs. In classification settings, one is often interested in discovering discriminative frequent subgraphs, whose presence or absence is indicative of the class membership of a graph. In this article, we propose an approach to feature selection on frequent subgraphs, called CORK, that combines two central advantages. First, it optimizes a submodular quality criterion, which means that we can yield a near-optimal solution using greedy feature selection. Second, our submodular quality function criterion can be integrated into gSpan, the state-of-the-art tool for frequent subgraph mining, and help to prune the search space for discriminative frequent subgraphs even during frequent subgraph mining. Copyright © 2010 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 3: 302-318, 2010","PeriodicalId":48684,"journal":{"name":"Statistical Analysis and Data Mining","volume":"3 1","pages":"302-318"},"PeriodicalIF":2.1000,"publicationDate":"2010-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"23","resultStr":"{\"title\":\"Discriminative frequent subgraph mining with optimality guarantees\",\"authors\":\"Marisa Thoma, Hong Cheng, A. Gretton, Jiawei Han, H. Kriegel, Alex Smola, Le Song, Philip S. Yu, Xifeng Yan, Karsten M. Borgwardt\",\"doi\":\"10.1002/SAM.V3:5\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The goal of frequent subgraph mining is to detect subgraphs that frequently occur in a dataset of graphs. In classification settings, one is often interested in discovering discriminative frequent subgraphs, whose presence or absence is indicative of the class membership of a graph. In this article, we propose an approach to feature selection on frequent subgraphs, called CORK, that combines two central advantages. First, it optimizes a submodular quality criterion, which means that we can yield a near-optimal solution using greedy feature selection. Second, our submodular quality function criterion can be integrated into gSpan, the state-of-the-art tool for frequent subgraph mining, and help to prune the search space for discriminative frequent subgraphs even during frequent subgraph mining. Copyright © 2010 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 3: 302-318, 2010\",\"PeriodicalId\":48684,\"journal\":{\"name\":\"Statistical Analysis and Data Mining\",\"volume\":\"3 1\",\"pages\":\"302-318\"},\"PeriodicalIF\":2.1000,\"publicationDate\":\"2010-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"23\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Statistical Analysis and Data Mining\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1002/SAM.V3:5\",\"RegionNum\":4,\"RegionCategory\":\"数学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Statistical Analysis and Data Mining","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1002/SAM.V3:5","RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 23
引用
批量引用
Discriminative frequent subgraph mining with optimality guarantees
The goal of frequent subgraph mining is to detect subgraphs that frequently occur in a dataset of graphs. In classification settings, one is often interested in discovering discriminative frequent subgraphs, whose presence or absence is indicative of the class membership of a graph. In this article, we propose an approach to feature selection on frequent subgraphs, called CORK, that combines two central advantages. First, it optimizes a submodular quality criterion, which means that we can yield a near-optimal solution using greedy feature selection. Second, our submodular quality function criterion can be integrated into gSpan, the state-of-the-art tool for frequent subgraph mining, and help to prune the search space for discriminative frequent subgraphs even during frequent subgraph mining. Copyright © 2010 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 3: 302-318, 2010