{"title":"关系、属性和数据值上的模式匹配","authors":"Aibo Tian, M. Kejriwal, Daniel P. Miranker","doi":"10.1145/2618243.2618248","DOIUrl":null,"url":null,"abstract":"Automatic schema matching algorithms are typically only concerned with finding attribute correspondences. However, real world data integration problems often require matchings whose arguments span all three types of elements in relational databases: relation, attribute and data value. This paper introduces the definitions and semantics of three additional correspondence types concerning both schema and data values. These correspondences cover the higher-order mappings identified in a seminal paper by Krishnamurthy, Litwin, and Kent. It is shown that these correspondences can be automatically translated to tuple generating dependencies (tgds), and thus this research is compatible with data integration applications that leverage tgds.\n Two methods for automatically identifying these correspondences are developed. One requires a limited number of duplicates across data sources. The other is a general instance-based method with no such requirement. Experiments conducted on four real world data sets demonstrate the effectiveness of the methods.","PeriodicalId":74773,"journal":{"name":"Scientific and statistical database management : International Conference, SSDBM ... : proceedings. International Conference on Scientific and Statistical Database Management","volume":"10 1","pages":"28:1-28:12"},"PeriodicalIF":0.0000,"publicationDate":"2014-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":"{\"title\":\"Schema matching over relations, attributes, and data values\",\"authors\":\"Aibo Tian, M. Kejriwal, Daniel P. Miranker\",\"doi\":\"10.1145/2618243.2618248\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Automatic schema matching algorithms are typically only concerned with finding attribute correspondences. However, real world data integration problems often require matchings whose arguments span all three types of elements in relational databases: relation, attribute and data value. This paper introduces the definitions and semantics of three additional correspondence types concerning both schema and data values. These correspondences cover the higher-order mappings identified in a seminal paper by Krishnamurthy, Litwin, and Kent. It is shown that these correspondences can be automatically translated to tuple generating dependencies (tgds), and thus this research is compatible with data integration applications that leverage tgds.\\n Two methods for automatically identifying these correspondences are developed. One requires a limited number of duplicates across data sources. The other is a general instance-based method with no such requirement. Experiments conducted on four real world data sets demonstrate the effectiveness of the methods.\",\"PeriodicalId\":74773,\"journal\":{\"name\":\"Scientific and statistical database management : International Conference, SSDBM ... : proceedings. International Conference on Scientific and Statistical Database Management\",\"volume\":\"10 1\",\"pages\":\"28:1-28:12\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-06-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"9\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Scientific and statistical database management : International Conference, SSDBM ... : proceedings. International Conference on Scientific and Statistical Database Management\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2618243.2618248\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Scientific and statistical database management : International Conference, SSDBM ... : proceedings. International Conference on Scientific and Statistical Database Management","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2618243.2618248","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Schema matching over relations, attributes, and data values
Automatic schema matching algorithms are typically only concerned with finding attribute correspondences. However, real world data integration problems often require matchings whose arguments span all three types of elements in relational databases: relation, attribute and data value. This paper introduces the definitions and semantics of three additional correspondence types concerning both schema and data values. These correspondences cover the higher-order mappings identified in a seminal paper by Krishnamurthy, Litwin, and Kent. It is shown that these correspondences can be automatically translated to tuple generating dependencies (tgds), and thus this research is compatible with data integration applications that leverage tgds.
Two methods for automatically identifying these correspondences are developed. One requires a limited number of duplicates across data sources. The other is a general instance-based method with no such requirement. Experiments conducted on four real world data sets demonstrate the effectiveness of the methods.