Faten A. Elshwimy, Alsayed Algergawy, A. Sarhan, E. Sallam
Published in: 2014 IEEE 30th International Conference on Data Engineering Workshops, 2014-03-01. DOI: 10.1109/ICDEW.2014.6818306 (https://doi.org/10.1109/ICDEW.2014.6818306)
Aggregation of similarity measures in schema matching based on generalized mean
Schema matching is a critical step in integrating heterogeneous e-Business and shared-data applications. Most existing schema matching approaches rely heavily on similarity-based techniques, which discover correspondences from several element similarity measures, each computed by an individual base matcher. Aggregating the results of multiple base matchers is widely accepted as a promising way to obtain more accurate correspondences. Some current matching systems aggregate the similarities of different element matchers using experimentally chosen weights, while others use machine learning to find the optimal weights to assign to each matcher. However, both approaches have deficiencies. To overcome the limitations of existing aggregation strategies and to achieve better performance, this paper proposes a new aggregation strategy, called the AHGM strategy, which aggregates multiple element matchers based on the concept of the generalized mean. In particular, we first develop a practical way to obtain the optimal weights to be assigned to each matcher for a given aggregation task. We then use these weights in our aggregation method to improve the performance of matcher combination. To validate the proposed strategy, we conducted a set of experiments, and the obtained results are encouraging.
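To make the idea of aggregating matcher scores with a generalized mean concrete, here is a minimal Python sketch of the textbook weighted generalized (power) mean, M_p(s; w) = (sum_i w_i * s_i^p)^(1/p), with the geometric mean as the limit p -> 0. This is not the AHGM strategy itself: the abstract does not give its formula or say how the optimal weights are obtained, so the function name, the exponent parameter p, and the example scores and weights below are all assumptions made for illustration.

```python
import math
from typing import Sequence


def weighted_generalized_mean(sims: Sequence[float],
                              weights: Sequence[float],
                              p: float) -> float:
    """Aggregate base-matcher similarity scores with a weighted power mean.

    Hypothetical helper, not taken from the paper. Scores are assumed to be
    non-negative (typically in [0, 1]); weights are normalized to sum to 1.
    """
    if not sims or len(sims) != len(weights):
        raise ValueError("sims and weights must be non-empty and of equal length")
    if any(s < 0 for s in sims) or any(w < 0 for w in weights):
        raise ValueError("scores and weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")

    # Keep only matchers with a positive normalized weight.
    pairs = [(s, w / total) for s, w in zip(sims, weights) if w > 0]

    # For p <= 0 a single zero score drives the mean to zero (the limit value).
    if p <= 0 and any(s == 0 for s, _ in pairs):
        return 0.0
    if p == 0:
        # Limit p -> 0: weighted geometric mean.
        return math.exp(sum(w * math.log(s) for s, w in pairs))
    return sum(w * s ** p for s, w in pairs) ** (1.0 / p)


if __name__ == "__main__":
    # Made-up scores from three base matchers for one element pair,
    # with made-up weights (e.g. name, data type, structure matchers).
    scores = [0.9, 0.6, 0.3]
    weights = [0.5, 0.3, 0.2]
    for p in (-1, 0, 1, 2):  # harmonic, geometric, arithmetic, quadratic
        print(p, weighted_generalized_mean(scores, weights, p))
```

The exponent p controls how strict the aggregation is: smaller values pull the result toward the lowest matcher score, larger values toward the highest, and p = 1 reduces to the familiar weighted average.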