(Not) yet another matcher
F. Duchateau, Rémi Coletta, Zohra Bellahsene, Renée J. Miller
Proceedings of the 18th ACM conference on Information and knowledge management, 2009-11-02
DOI: 10.1145/1645953.1646165 (https://doi.org/10.1145/1645953.1646165)
Citations: 36
Abstract
Discovering correspondences between schema elements is a crucial task for data integration. Most schema matching tools are semi-automatic: for example, an expert must tune parameters such as thresholds and weights. They mainly rely on several methods to combine and aggregate similarity measures. However, the quality of their results often decreases when a new similarity measure must be integrated or when schemas from a particular domain are matched. This paper describes YAM (Yet Another Matcher), a schema matcher factory: it generates a dedicated matcher for a given schema matching scenario, according to user inputs. Our approach is based on machine learning, since schema matchers can be seen as classifiers. Several sets of experiments comparing matchers generated by YAM against traditional matching tools show how our approach is able to generate the best matcher for a given scenario.
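The idea that a schema matcher can be seen as a classifier, and that a factory can pick a dedicated one per scenario, can be illustrated with a minimal sketch. This is not YAM's implementation: the two similarity measures, the threshold grid, the F-measure selection criterion, and all names (`generate_matcher`, `MEASURES`, the sample element names) are assumptions made for illustration. Each candidate classifier is a single similarity measure with a threshold, and the factory keeps the candidate that scores best on correspondences supplied by the user.

```python
import re
from difflib import SequenceMatcher


def tokens(name):
    # Split an element name on underscores and camelCase boundaries.
    return {t.lower() for t in re.findall(r"[A-Za-z][a-z]*|\d+", name)}


def string_sim(a, b):
    # Character-level similarity in [0, 1].
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def token_sim(a, b):
    # Jaccard similarity over name tokens.
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


# Hypothetical pool of similarity measures; a real tool would have many more.
MEASURES = {"string": string_sim, "token": token_sim}


def f_measure(predicted, expected):
    # Harmonic mean of precision and recall over sets of correspondences.
    tp = len(predicted & expected)
    if tp == 0:
        return 0.0
    precision = tp / len(predicted)
    recall = tp / len(expected)
    return 2 * precision * recall / (precision + recall)


def generate_matcher(pairs, expected, thresholds=(0.3, 0.5, 0.7, 0.9)):
    # Try every (measure, threshold) classifier and keep the best one
    # with respect to the user-supplied expected correspondences.
    best = None
    for name, measure in MEASURES.items():
        for t in thresholds:
            predicted = {p for p in pairs if measure(*p) >= t}
            score = f_measure(predicted, expected)
            if best is None or score > best[0]:
                best = (score, name, t)
    score, name, t = best
    measure = MEASURES[name]

    def matcher(a, b):
        # The generated matcher classifies a pair as match / no match.
        return measure(a, b) >= t

    return matcher, name, t, score


# Illustrative scenario: all candidate pairs between two small schemas,
# plus the correspondences an expert marked as correct.
source = ["customerName", "zipCode", "phone"]
target = ["customer_name", "postal_code", "telephone"]
pairs = [(s, t) for s in source for t in target]
expected = {
    ("customerName", "customer_name"),
    ("zipCode", "postal_code"),
    ("phone", "telephone"),
}

matcher, chosen_measure, chosen_threshold, score = generate_matcher(pairs, expected)
```

Selecting on the same pairs used for evaluation is a simplification; a more careful setup would hold out part of the labeled correspondences to estimate how well the generated matcher generalizes.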