Accelerating similarity-based model matching with subtree equivalence

IF 4.3 · Zone 2 (Computer Science) · JCR Q2 (Computer Science, Information Systems)

Information and Software Technology · Pub Date: 2025-09-13 · DOI: 10.1016/j.infsof.2025.107879

Xiao He, Kai Liu, Yifan Zhang, Huihong He

Volume 188, Article 107879. Source: https://www.sciencedirect.com/science/article/pii/S0950584925002186

Citations: 0

Abstract

Context:

Efficient version management of models in model-driven software engineering is vital for modeling tools, necessitating model matching, differencing, and merging to incorporate various model versions. Although similarity-based matching is the most general method, its computational complexity escalates at a cubic rate with the number of elements.

Objective:

This paper introduces StEqMatch, a subtree-equivalence-based approach to accelerate similarity-based model matching, inspired by the observation that changes between consecutive versions typically affect only a small portion of a model.

Methods:

StEqMatch initially decomposes a model into a series of subtrees. Rather than performing element-wise matching directly, our approach tries to find equivalent (i.e., either identical or closely similar) subtrees, representing the unchanged portion of a model, thus enabling quick pairing of elements within these subtrees. To effectively identify equivalent subtrees, this paper develops two hash functions for equality and similarity comparison of model trees.
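
The idea of pairing elements through identical subtrees can be illustrated with a minimal sketch. It is not StEqMatch's actual algorithm or hash functions, which the abstract does not specify: it assumes a toy containment tree and swaps in a generic Merkle-style SHA-256 hash for the equality comparison. The names `Node`, `index_subtrees`, and `match_identical_subtrees` are hypothetical, and the similarity hash is not covered. Elements left unpaired would still go to ordinary element-wise similarity matching.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy model element (hypothetical): a type, a name, contained children."""
    kind: str
    name: str
    children: list["Node"] = field(default_factory=list)


def subtree_nodes(node):
    """Yield every element of the subtree rooted at `node` (preorder)."""
    yield node
    for child in node.children:
        yield from subtree_nodes(child)


def index_subtrees(root):
    """Hash every subtree bottom-up (Merkle style).

    Returns (hash_of, by_hash): id(node) -> digest, and digest -> [nodes].
    Assumes kinds and names never contain the separator character.
    """
    hash_of, by_hash = {}, {}

    def visit(node):
        child_digests = [visit(c) for c in node.children]
        payload = "\x1f".join([node.kind, node.name, *child_digests])
        digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
        hash_of[id(node)] = digest
        by_hash.setdefault(digest, []).append(node)
        return digest

    visit(root)
    return hash_of, by_hash


def match_identical_subtrees(old, new):
    """Pair elements inside identical subtrees; return (pairs, leftovers)."""
    _, old_by_hash = index_subtrees(old)
    new_hash_of, _ = index_subtrees(new)
    used_old = set()  # ids of old elements that are already paired
    pairs, leftovers = [], []

    def pair_all(a, b):
        # Equal hashes imply equal shape, so children line up by position.
        used_old.add(id(a))
        pairs.append((a, b))
        for ca, cb in zip(a.children, b.children):
            pair_all(ca, cb)

    def walk(node):
        for cand in old_by_hash.get(new_hash_of[id(node)], []):
            if all(id(n) not in used_old for n in subtree_nodes(cand)):
                pair_all(cand, node)  # whole subtree paired in one step
                return
        leftovers.append(node)  # changed element: needs similarity matching
        for child in node.children:
            walk(child)

    walk(new)
    return pairs, leftovers


# Two versions of a small model; only attribute z was renamed to w.
v1 = Node("Package", "p", [
    Node("Class", "A", [Node("Attribute", "x"), Node("Attribute", "y")]),
    Node("Class", "B", [Node("Attribute", "z")]),
])
v2 = Node("Package", "p", [
    Node("Class", "A", [Node("Attribute", "x"), Node("Attribute", "y")]),
    Node("Class", "B", [Node("Attribute", "w")]),
])
pairs, leftovers = match_identical_subtrees(v1, v2)
print([(a.name, b.name) for a, b in pairs])  # [('A', 'A'), ('x', 'x'), ('y', 'y')]
print([n.name for n in leftovers])           # ['p', 'B', 'w']
```

In this toy run, class A and its attributes are paired through a single hash lookup, while only the three elements on the changed path remain for the more expensive similarity comparison.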

Results:

Experiments using open-source Ecore and UML models indicate that StEqMatch is, on average, 1.27 to 22.5 times faster than the state-of-the-art model matching tool, while reducing the error rates in most cases.

Conclusion:

StEqMatch combines subtree matching and element-wise matching, and can improve the efficiency and the quality of similarity-based model matching.


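Source journal

Information and Software Technology (Engineering & Technology, Computer Science: Software Engineering)

CiteScore: 9.10

Self-citation rate: 7.70%

Articles published: 164

Review time: 9.6 weeks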

About the journal: Information and Software Technology is the international archival journal focusing on research and experience that contributes to the improvement of software development practices. The journal's scope includes methods and techniques to better engineer software and manage its development. Articles submitted for review should have a clear component of software engineering or address ways to improve the engineering and management of software development. Areas covered by the journal include:

• Software management, quality and metrics
• Software processes
• Software architecture, modelling, specification, design and programming
• Functional and non-functional software requirements
• Software testing and verification & validation
• Empirical studies of all aspects of engineering and managing software development

Short Communications is a new section dedicated to short papers addressing new ideas, controversial opinions, "negative" results and much more. Read the Guide for authors for more information. The journal encourages and welcomes submissions of systematic literature studies (reviews and maps) within the scope of the journal. Information and Software Technology is the premier outlet for systematic literature studies in software engineering.