Using Element Clustering to Increase the Efficiency of XML Schema Matching

M. Smiljanic, M. V. Keulen, W. Jonker
{"title":"Using Element Clustering to Increase the Efficiency of XML Schema Matching","authors":"M. Smiljanic, M. V. Keulen, W. Jonker","doi":"10.1109/ICDEW.2006.159","DOIUrl":null,"url":null,"abstract":"Schema matching attempts to discover semantic mappings between elements of two schemas. Elements are cross compared using various heuristics (e.g., name, data-type, and structure similarity). Seen from a broader perspective, the schema matching problem is a combinatorial problem with an exponential complexity. This makes the naive matching algorithms for large schemas prohibitively inefficient. In this paper we propose a clustering based technique for improving the efficiency of large scale schema matching. The technique inserts clustering as an intermediate step into existing schema matching algorithms. Clustering partitions schemas and reduces the overall matching load, and creates a possibility to trade between the efficiency and effectiveness. The technique can be used in addition to other optimization techniques. In the paper we describe the technique, validate the performance of one implementation of the technique, and open directions for future research.","PeriodicalId":331953,"journal":{"name":"22nd International Conference on Data Engineering Workshops (ICDEW'06)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2006-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"33","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"22nd International Conference on Data Engineering Workshops (ICDEW'06)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDEW.2006.159","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 33

Abstract

Schema matching attempts to discover semantic mappings between elements of two schemas. Elements are cross compared using various heuristics (e.g., name, data-type, and structure similarity). Seen from a broader perspective, the schema matching problem is a combinatorial problem with an exponential complexity. This makes the naive matching algorithms for large schemas prohibitively inefficient. In this paper we propose a clustering based technique for improving the efficiency of large scale schema matching. The technique inserts clustering as an intermediate step into existing schema matching algorithms. Clustering partitions schemas and reduces the overall matching load, and creates a possibility to trade between the efficiency and effectiveness. The technique can be used in addition to other optimization techniques. In the paper we describe the technique, validate the performance of one implementation of the technique, and open directions for future research.
利用元素聚类提高XML模式匹配效率
模式匹配试图发现两个模式元素之间的语义映射。使用各种启发式方法(例如,名称、数据类型和结构相似性)交叉比较元素。从广义上看,模式匹配问题是一个具有指数复杂度的组合问题。这使得用于大型模式的朴素匹配算法的效率非常低。本文提出了一种基于聚类的方法来提高大规模模式匹配的效率。该技术将聚类作为中间步骤插入现有的模式匹配算法中。集群对模式进行分区,减少总体匹配负载,并在效率和有效性之间进行权衡。该技术可以与其他优化技术一起使用。在本文中,我们描述了该技术,验证了该技术的一个实现的性能,并为未来的研究开辟了方向。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信