{"title":"图相似度搜索的研究","authors":"Haichuan Shang, M. Kitsuregawa","doi":"10.11185/IMT.7.1565","DOIUrl":null,"url":null,"abstract":"Graph similarity search is to retrieve graphs that approximately contain a given query graph. It has many applications, e.g., detecting similar functions among chemical compounds. The problem is challenging as even testing subgraph containment between two graphs is NP-complete. Hence, existing techniques adopt the filtering-and-verification framework with the focus on developing effective and efficient techniques to remove non-promising graphs. Nevertheless, existing filtering techniques may be still unable to effectively remove many ”low” quality candidates. To resolve this, in this paper we propose a novel indexing technique to index graphs according to their ”distances” to features. We then develop lower and upper bounding techniques that exploit the index to (1) prune non-promising graphs and (2) include graphs whose similarities are guaranteed to exceed the given similarity threshold. Considering that the verification phase is not well studied and plays the dominant role in the whole process, we devise efficient algorithms to verify candidates. A comprehensive experiment using real datasets demonstrates that our proposed methods significantly outperform existing methods.","PeriodicalId":16243,"journal":{"name":"Journal of Information Processing","volume":"7 1","pages":"1565-1570"},"PeriodicalIF":0.0000,"publicationDate":"2012-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A Study on Graph Similarity Search\",\"authors\":\"Haichuan Shang, M. Kitsuregawa\",\"doi\":\"10.11185/IMT.7.1565\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Graph similarity search is to retrieve graphs that approximately contain a given query graph. It has many applications, e.g., detecting similar functions among chemical compounds. The problem is challenging as even testing subgraph containment between two graphs is NP-complete. Hence, existing techniques adopt the filtering-and-verification framework with the focus on developing effective and efficient techniques to remove non-promising graphs. Nevertheless, existing filtering techniques may be still unable to effectively remove many ”low” quality candidates. To resolve this, in this paper we propose a novel indexing technique to index graphs according to their ”distances” to features. We then develop lower and upper bounding techniques that exploit the index to (1) prune non-promising graphs and (2) include graphs whose similarities are guaranteed to exceed the given similarity threshold. Considering that the verification phase is not well studied and plays the dominant role in the whole process, we devise efficient algorithms to verify candidates. A comprehensive experiment using real datasets demonstrates that our proposed methods significantly outperform existing methods.\",\"PeriodicalId\":16243,\"journal\":{\"name\":\"Journal of Information Processing\",\"volume\":\"7 1\",\"pages\":\"1565-1570\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2012-07-25\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Information Processing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.11185/IMT.7.1565\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q4\",\"JCRName\":\"Computer Science\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Information Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.11185/IMT.7.1565","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"Computer Science","Score":null,"Total":0}
Graph similarity search is to retrieve graphs that approximately contain a given query graph. It has many applications, e.g., detecting similar functions among chemical compounds. The problem is challenging as even testing subgraph containment between two graphs is NP-complete. Hence, existing techniques adopt the filtering-and-verification framework with the focus on developing effective and efficient techniques to remove non-promising graphs. Nevertheless, existing filtering techniques may be still unable to effectively remove many ”low” quality candidates. To resolve this, in this paper we propose a novel indexing technique to index graphs according to their ”distances” to features. We then develop lower and upper bounding techniques that exploit the index to (1) prune non-promising graphs and (2) include graphs whose similarities are guaranteed to exceed the given similarity threshold. Considering that the verification phase is not well studied and plays the dominant role in the whole process, we devise efficient algorithms to verify candidates. A comprehensive experiment using real datasets demonstrates that our proposed methods significantly outperform existing methods.