Using Element Clustering to Increase the Efficiency of XML Schema Matching

22nd International Conference on Data Engineering Workshops (ICDEW'06), Pub Date: 2006-04-03, DOI: 10.1109/ICDEW.2006.159

M. Smiljanic, M. V. Keulen, W. Jonker

Citations: 33

Abstract

Schema matching attempts to discover semantic mappings between the elements of two schemas. Elements are cross-compared using various heuristics (e.g., name, data-type, and structure similarity). Viewed more broadly, schema matching is a combinatorial problem with exponential complexity, which makes naive matching algorithms prohibitively inefficient for large schemas. In this paper we propose a clustering-based technique for improving the efficiency of large-scale schema matching. The technique inserts clustering as an intermediate step into existing schema matching algorithms. Clustering partitions the schemas, reduces the overall matching load, and makes it possible to trade off efficiency against effectiveness. The technique can be combined with other optimization techniques. We describe the technique, validate the performance of one implementation of it, and outline directions for future research.
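To make the idea of clustering as an intermediate step concrete, the following is a minimal sketch and not the authors' implementation. It assumes a name-only similarity heuristic (Python's `difflib.SequenceMatcher`) and a deliberately crude, hypothetical clustering key (the first letter of the element name); the function names, the threshold value, and the clustering criterion are all illustrative assumptions. It shows how partitioning the target elements reduces the number of pairwise comparisons, and why matches can be lost when related elements land in different clusters.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Case-insensitive name similarity in [0, 1] (assumed heuristic)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def naive_match(source, target, threshold=0.6):
    """Cross-compare every source element with every target element."""
    comparisons = 0
    matches = []
    for s in source:
        for t in target:
            comparisons += 1
            if name_similarity(s, t) >= threshold:
                matches.append((s, t))
    return matches, comparisons


def first_letter_key(name):
    """Hypothetical cluster key: the lowercased first letter of the name."""
    return name[0].lower()


def clustered_match(source, target, key=first_letter_key, threshold=0.6):
    """Partition target elements by `key`, then compare within clusters only."""
    clusters = defaultdict(list)
    for t in target:
        clusters[key(t)].append(t)

    comparisons = 0
    matches = []
    for s in source:
        # Only elements sharing the source element's cluster are compared.
        for t in clusters.get(key(s), []):
            comparisons += 1
            if name_similarity(s, t) >= threshold:
                matches.append((s, t))
    return matches, comparisons
```

Because both functions use the same similarity function and threshold, the clustered variant can only return a subset of the naive matches. A coarser key recovers more of them at the cost of more comparisons, which is one simple way to picture the trade-off between efficiency and effectiveness described in the abstract.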


