Accelerating similarity-based model matching using on-the-fly similarity preserving hashing

Xiaoyu He, Letian Tang, Yutong Li
Proceedings of the 25th International Conference on Model Driven Engineering Languages and Systems, published 2022-10-23
DOI: 10.1145/3550355.3552406 (https://doi.org/10.1145/3550355.3552406)
Citations: 0

Abstract

Similarity-based model matching is the foundation of model versioning. It pairs model elements based on a distance metric (e.g., edit distance). Because calculating the distance between two elements is expensive, a similarity-based matcher usually suffers from performance issues as the model size increases. This paper proposes a hash-based approach to accelerate similarity-based model matching. First, we design a novel similarity-preserving hash function that maps a model element to a 64-bit hash value; if two elements are similar, their hashes are also very close. Second, we propose a 3-layer index structure and a query algorithm that use these hashes to quickly filter out impossible candidates for the element to be matched. For the remaining candidates, we employ the classical similarity-based matching algorithm to determine the final matches. Our approach has been implemented and integrated into EMF Compare. The evaluation results show that our hash function is effective at preserving the similarity between model elements and that our matching approach reduces time costs by 16% to 72% while keeping the matching results consistent with those of EMF Compare.
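The abstract does not specify the hash function or the 3-layer index, so the following is only a rough sketch of the general idea of hash-based candidate filtering, not the authors' method. It assumes a generic SimHash over character trigrams as a stand-in for the paper's similarity-preserving hash, Hamming distance as the notion of hashes being "close", and a plain linear scan in place of the index structure. All function names (`simhash64`, `shingles`, `hamming`, `filter_candidates`) and the distance threshold are hypothetical.

```python
import hashlib


def _token_hash64(token: str) -> int:
    """Hash one token to a 64-bit integer (BLAKE2b with an 8-byte digest)."""
    digest = hashlib.blake2b(token.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def shingles(text: str, n: int = 3) -> list[str]:
    """Split a string into overlapping character n-grams (assumed feature set)."""
    text = text.lower()
    return [text[i:i + n] for i in range(max(1, len(text) - n + 1))]


def simhash64(tokens: list[str]) -> int:
    """Generic SimHash: each token votes +1/-1 on every one of the 64 bits.

    Inputs that share most tokens tend to agree on most bits, so their
    hashes tend to have a small Hamming distance.
    """
    votes = [0] * 64
    for token in tokens:
        h = _token_hash64(token)
        for bit in range(64):
            votes[bit] += 1 if (h >> bit) & 1 else -1
    value = 0
    for bit in range(64):
        if votes[bit] > 0:
            value |= 1 << bit
    return value


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def filter_candidates(query_hash: int, candidates: dict[str, int],
                      max_distance: int) -> list[str]:
    """Keep only candidates whose hash is within max_distance bits of the query.

    A linear scan is used here for clarity; an index would avoid visiting
    every candidate.
    """
    return [name for name, h in candidates.items()
            if hamming(query_hash, h) <= max_distance]


if __name__ == "__main__":
    # Hypothetical element names from two model versions.
    old_elements = ["Customer.address", "Customer.name", "Order.totalPrice"]
    index = {name: simhash64(shingles(name)) for name in old_elements}

    query = simhash64(shingles("Customer.adress"))
    survivors = filter_candidates(query, index, max_distance=20)
    print(survivors)  # only these would go on to the expensive distance metric
```

In this sketch the expensive distance computation (for example, edit distance) would run only on the names returned by `filter_candidates`, which mirrors the two-stage structure described in the abstract: cheap hash-based filtering first, then the classical similarity-based matcher on the remaining candidates.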