Internal Filtering Approach toward Efficiency Optimization of Matching Large Scale XML Schemas

2013 16th International Conference on Network-Based Information Systems. Pub Date: 2013-09-04. DOI: 10.1109/NBiS.2013.77

Ahmad Abdullah Alqarni, E. Pardede


Citations: 1

Abstract

XML Schema matching plays a significant role in integrating different XML Schemas by finding corresponding elements that are similar. The properties of XML Schema elements and their relationships to surrounding elements play a significant role in improving the quality of the matching process. Investigating all measures for each element in two schemas can result in a long execution time, which reduces the performance of the matching process. Performance becomes a particular concern for large-scale XML Schemas, where elements carry many features and surrounding elements. Since the internal features of an element account for 40-60% of the total similarity value, they should be utilised to filter out elements that yield a low internal similarity value relative to a predefined threshold. Thus, we propose to use an element's internal features as a filter to exclude any element whose internal similarity falls below a predefined threshold. We also present an optimum threshold that can be used in the filtering approach. The idea is to use the internal features to detect the elements that are highly likely to be dissimilar and exclude them from the next phase, which investigates the element's context (its surroundings). The outcome of this approach is promising not only for improving matching efficiency per se, but also for maintaining results of acceptable quality that are very close to those of the non-filter approach.
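The two-phase idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Element` type, the `internal_similarity` measure (an equal-weight mix of name and data-type equality), the default `threshold`, and the `internal_weight` of 0.5 (chosen from within the 40-60% range the abstract mentions) are all assumptions made for illustration. The abstract does not state the optimum threshold value, so none is claimed here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Tuple


@dataclass(frozen=True)
class Element:
    """Hypothetical stand-in for an XML Schema element and its internal features."""
    name: str
    data_type: str


def internal_similarity(a: Element, b: Element) -> float:
    """Assumed internal measure: equal-weight name and data-type equality, in [0, 1]."""
    name_score = 1.0 if a.name.lower() == b.name.lower() else 0.0
    type_score = 1.0 if a.data_type == b.data_type else 0.0
    return 0.5 * name_score + 0.5 * type_score


def match_with_filter(
    source: Iterable[Element],
    target: Iterable[Element],
    context_similarity: Callable[[Element, Element], float],
    threshold: float = 0.5,        # assumed value, not the paper's optimum
    internal_weight: float = 0.5,  # assumed, within the 40-60% range
) -> Dict[Tuple[str, str], float]:
    """Score element pairs, skipping the context phase for filtered pairs."""
    target_list = list(target)
    results: Dict[Tuple[str, str], float] = {}
    for s in source:
        for t in target_list:
            internal = internal_similarity(s, t)
            if internal < threshold:
                # Likely dissimilar: exclude before the costly context phase.
                continue
            context = context_similarity(s, t)
            total = internal_weight * internal + (1.0 - internal_weight) * context
            results[(s.name, t.name)] = total
    return results
```

The efficiency gain comes from `context_similarity` being called only for pairs that pass the filter. The trade-off is that a pair with low internal similarity but strong contextual similarity is never scored, which is why the choice of threshold matters for result quality.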
