Anytime and Distributed Approaches for Graph Matching

Q4 Computer Science

Electronic Letters on Computer Vision and Image Analysis, pp. 13-15. Pub Date: 2016-11-04. DOI: 10.5565/REV/ELCVIA.986

Zeina Abu-Aisheh

Citations: 2

Abstract

Owing to the inherent genericity of graph-based representations and to improvements in computing capacity, structural representations have become increasingly popular in Pattern Recognition (PR). In a graph-based representation, vertices and their attributes describe objects (or parts of them), while edges represent the relationships between those objects. Representing objects as graphs turns object comparison into graph matching (GM), in which correspondences between the vertices and edges of two graphs must be found. Within GM, Graph Edit Distance (GED) has received particular attention over the last decade because it is flexible enough to match many types of graphs. GED has been applied in a wide range of applications, from molecule recognition to image classification.

Researchers have concentrated on approximate methods, which find suboptimal solutions that are hopefully close to the optimal ones, but the gap between optimal and suboptimal solutions has not yet been studied in depth. For that reason, this thesis focuses on exact GED algorithms. Unfortunately, exact GED methods have exponential complexity, so designing an exact GED algorithm that scales to the graphs involved in PR tasks is a great challenge. Two promising ways to cut computation time are search space pruning and distributed algorithms. To this end, we first propose a depth-first GED algorithm that requires less memory and search time. All possible solutions are evaluated without being explicitly enumerated: candidates are discarded using an upper and lower bound strategy.

To find a trade-off between speed and optimality, we describe how to convert the proposed depth-first GED method into an anytime one that delivers a first solution very quickly. Instead of returning one and only one solution (i.e., the optimal one), it produces a sequence of improved solutions and, given enough time, eventually converges to the optimal solution. We analyze the properties of such methods for solving GM problems and measure their performance in terms of the accuracy of the returned solution compared to the optimal solution or to the best one found by state-of-the-art methods.

This thesis is also a first attempt to reduce the run time of exact GED methods through parallel and distributed computing. Two parallel and distributed GED approaches are put forward, both based on the depth-first GED method. The search space is decomposed into smaller search trees that are solved independently, in a parallel or distributed manner.

To benchmark the proposed GED methods, we propose assessing them not only in a classification context but also at the graph level (i.e., evaluating the accuracy of their distances and matchings). Because exact GED algorithms have exponential complexity, we obtain this information by analyzing the behavior of the eight compared methods under time and memory constraints. In addition to these performance evaluation metrics, we propose a graph database repository dedicated to GED. In this repository, we add graph-level information to well-known, publicly used databases: the best edit distance found for each pair of graphs, together with the vertex-to-vertex and edge-to-edge mappings that correspond to it. This information helps in assessing the feasibility of exact and approximate GED methods.

This thesis calls into question the common belief that exact error-tolerant GM methods cannot be used in real-world applications when matching large graphs, or even in a classification context. We argue and show that a new type of GM method, referred to as anytime methods, can be successful at the graph level as well as in a classification context. Anytime videos, pseudo-code and the publications related to the thesis are publicly available at: http://www.rfai.li.univ-tours.fr/PagesPerso/zabuaisheh/home.html. The thesis is also publicly available at: http://www.rfai.li.univ-tours.fr/Documents/Articles_RFAI/PhD2016zeina.pdf
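The idea of a depth-first search that discards candidates with upper and lower bounds, and that can be interrupted at any time, can be illustrated with a minimal sketch. This is not the algorithm from the thesis: the function name `anytime_ged`, the unit edit costs, the restriction to simple undirected graphs with labeled vertices and unlabeled edges, and the very weak lower bound (the difference in the number of unmatched vertices) are all assumptions made for illustration only.

```python
def anytime_ged(labels1, edges1, labels2, edges2):
    """Yield (cost, mapping) pairs with strictly decreasing cost.

    mapping[i] is the g2 vertex matched to g1 vertex i, or None if i is deleted.
    Assumed unit costs: 1 per vertex/edge insertion or deletion,
    1 per vertex substitution with a different label, 0 otherwise.
    If the generator is exhausted, the last pair yielded is optimal.
    """
    e1 = {frozenset(e) for e in edges1}
    e2 = {frozenset(e) for e in edges2}
    n1, n2 = len(labels1), len(labels2)
    best = [float("inf")]  # upper bound: cost of the best solution so far

    def step_cost(mapping, i, v):
        """Cost of matching g1 vertex i to g2 vertex v (None = delete i)."""
        cost = 1 if v is None or labels1[i] != labels2[v] else 0
        for j, w in enumerate(mapping):  # g1 vertices already processed
            in1 = frozenset((i, j)) in e1
            in2 = v is not None and w is not None and frozenset((v, w)) in e2
            cost += int(in1 != in2)  # edge deleted or inserted
        return cost

    def completion_cost(mapping):
        """Cost of inserting the g2 vertices and edges left unmatched."""
        used = {v for v in mapping if v is not None}
        inserted_edges = sum(1 for e in e2 if not e <= used)
        return (n2 - len(used)) + inserted_edges

    def search(mapping, used, g):
        i = len(mapping)
        if i == n1:  # leaf: every g1 vertex has been processed
            total = g + completion_cost(mapping)
            if total < best[0]:
                best[0] = total
                yield total, list(mapping)
            return
        candidates = [v for v in range(n2) if v not in used] + [None]
        # Try cheap assignments first so a good solution appears early.
        for v in sorted(candidates, key=lambda c: step_cost(mapping, i, c)):
            new_g = g + step_cost(mapping, i, v)
            rem1 = n1 - i - 1
            rem2 = n2 - len(used) - (v is not None)
            # Lower bound: at least |rem1 - rem2| vertices must still be
            # inserted or deleted. Prune if it cannot beat the upper bound.
            if new_g + abs(rem1 - rem2) >= best[0]:
                continue
            mapping.append(v)
            if v is not None:
                used.add(v)
            yield from search(mapping, used, new_g)
            mapping.pop()
            if v is not None:
                used.discard(v)

    yield from search([], set(), 0)


if __name__ == "__main__":
    # Hypothetical toy graphs: a C-C-O chain matched against a C-O chain.
    g1 = (["C", "C", "O"], [(0, 1), (1, 2)])
    g2 = (["C", "O"], [(0, 1)])
    for cost, mapping in anytime_ged(*g1, *g2):
        print(cost, mapping)  # each line improves on the previous one
```

Because the function is a generator, the caller decides when to stop: consuming only the first item gives a fast first solution, while exhausting the generator gives the optimum (cost 2 for the toy graphs above, one vertex deletion plus one edge deletion). A tighter lower bound would prune far more of the search tree; the one used here was chosen only because its admissibility is easy to check.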
