Algebraic operator support for semantic data fusion in extended SQL
S. Hosain, H. Jamil
2010 IEEE 9th International Conference on Cybernetic Intelligent Systems, 2010-09-01
DOI: 10.1109/UKRICIS.2010.5898129 (https://doi.org/10.1109/UKRICIS.2010.5898129)
One of the basic operations required to gather more information about an object is called information aggregation or data fusion. The process requires recognizing a semantic object and gathering the new information into the collection that already exists for that object. A related operation is collecting a set of distinct semantic objects that are similar. These operations become complicated in the presence of schema and extent heterogeneity and semantic similarity. Although a rich body of research has addressed these issues, database language support is not yet available, possibly because an algebraic formulation of these concepts has been absent. An algebraic characterization is needed for query plan generation, optimization and query processing. In this paper, we propose two new binary operators called link (λ) and combine (χ) that capture the spirit of vertical and horizontal data fusion. The proposed operators leverage developments in schema matching and key identification technologies by casting them as user-selectable functions μ and κ. We show that link and combine are generalized versions of the traditional join and union operations. We also propose two extensions of SQL that exploit these two operators and open up many optimization possibilities. Finally, we point out that link and combine are useful for semantic data integration and are currently being used in the LifeDB data management system for Life Sciences applications.
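To make the idea of join-like and union-like operators parameterized by a schema matcher and a key identifier more concrete, here is a minimal Python sketch. It is not the paper's definition of λ and χ: the abstract does not give their formal semantics, so the function signatures, the renaming step, the treatment of conflicts and the sample data are all assumptions made for illustration. The sketch only shows how a join can be generalized by letting a matcher `mu` align attribute names and a key finder `kappa` choose the attributes that identify an object, and how a union can be generalized by aligning schemas before merging.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Relation = List[Row]
# Hypothetical shape for mu: given the attribute names of r and s,
# return a mapping from attribute names of s to the matching names in r.
Matcher = Callable[[List[str], List[str]], Dict[str, str]]
# Hypothetical shape for kappa: given attribute names, return the key attributes.
KeyFinder = Callable[[List[str]], List[str]]


def schema(rel: Relation) -> List[str]:
    """Attribute names of a relation, read from its first row."""
    return list(rel[0].keys()) if rel else []


def rename(rel: Relation, mapping: Dict[str, str]) -> Relation:
    """Rename attributes according to mapping; unmapped names are kept."""
    return [{mapping.get(a, a): v for a, v in row.items()} for row in rel]


def link(r: Relation, s: Relation, mu: Matcher, kappa: KeyFinder) -> Relation:
    """Join-like sketch: pair rows that agree on the key chosen by kappa,
    after s has been aligned to r's attribute names by mu."""
    s_aligned = rename(s, mu(schema(r), schema(s)))
    key = kappa(schema(r))
    out = []
    for a in r:
        for b in s_aligned:
            if all(k in b and a[k] == b[k] for k in key):
                out.append({**b, **a})  # assumption: r's values win on conflict
    return out


def combine(r: Relation, s: Relation, mu: Matcher) -> Relation:
    """Union-like sketch: align s to r via mu, pad missing attributes
    with None, and drop exact duplicate rows."""
    s_aligned = rename(s, mu(schema(r), schema(s)))
    attrs = schema(r) + [a for a in schema(s_aligned) if a not in schema(r)]
    out: Relation = []
    for row in r + s_aligned:
        padded = {a: row.get(a) for a in attrs}
        if padded not in out:
            out.append(padded)
    return out


def name_matcher(r_attrs: List[str], s_attrs: List[str]) -> Dict[str, str]:
    """Toy matcher backed by a hand-written synonym table (illustrative only)."""
    synonyms = {"gene_id": "id", "symbol": "name"}
    return {a: synonyms[a] for a in s_attrs if a in synonyms and synonyms[a] in r_attrs}


def first_attr_key(attrs: List[str]) -> List[str]:
    """Toy key finder: treat the first attribute as the key (illustrative only)."""
    return attrs[:1]


# Made-up sample data with heterogeneous attribute names.
genes = [{"id": "G1", "name": "BRCA1"}, {"id": "G2", "name": "TP53"}]
annotations = [{"gene_id": "G1", "function": "DNA repair"}]
more_genes = [{"gene_id": "G2", "symbol": "TP53"}, {"gene_id": "G3", "symbol": "EGFR"}]

linked = link(genes, annotations, name_matcher, first_attr_key)
combined = combine(genes, more_genes, name_matcher)
print(linked)
print(combined)
```

When `mu` returns an empty mapping and the two relations already share attribute names, `link` in this sketch behaves like an equi-join on the chosen key and `combine` behaves like a set union, which mirrors the abstract's claim that the operators generalize join and union.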