Exploiting data lineage for parallel optimization in extensible DBMSs
E. C. Shek, R. Muntz
Proceedings 15th International Conference on Data Engineering (Cat. No.99CB36337), 1999-03-23
DOI: 10.1109/ICDE.1999.754936 (https://doi.org/10.1109/ICDE.1999.754936)
Citations: 1
Abstract
Extensibility and high query performance are important requirements of advanced large-scale information systems, since complex data analysis often requires application-specific operations that must be introduced by the user issuing the query. To support automatic parallelization of queries containing complex user-defined evaluators in an extensible DBMS, we devised a relevance window model that captures the inherent data lineage characteristics of evaluators on multidimensional data sets. Informally, the relevance window of an evaluator defines the scope of influence that input data records have on the values of records in the output data space. An evaluator's relevance window constrains the data partitioning opportunities available to it.
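To make the idea of a relevance window constraining partitioning more concrete, here is a minimal sketch, not the authors' model or implementation. It assumes a one-dimensional data set and a hypothetical evaluator, `moving_sum`, whose relevance window is a fixed `radius` around each output position; all function names and parameters are illustrative. Under that assumption, the output space can be split into blocks, provided each block also reads the input records that fall within the window of its boundary outputs.

```python
from typing import List, Tuple


def moving_sum(data: List[int], radius: int) -> List[int]:
    """Hypothetical evaluator: output[i] sums the inputs within `radius` of i.

    Its (assumed) relevance window is therefore [i - radius, i + radius].
    """
    n = len(data)
    return [sum(data[max(0, i - radius):min(n, i + radius + 1)]) for i in range(n)]


def partition_with_halo(n: int, parts: int, radius: int) -> List[Tuple[int, int, int, int]]:
    """Split the output range [0, n) into `parts` contiguous blocks.

    Each tuple is (out_lo, out_hi, in_lo, in_hi): the half-open output range
    and the half-open input range, widened by the window radius and clipped
    to the data bounds.
    """
    bounds = [n * k // parts for k in range(parts + 1)]
    return [
        (lo, hi, max(0, lo - radius), min(n, hi + radius))
        for lo, hi in zip(bounds, bounds[1:])
    ]


def partitioned_moving_sum(data: List[int], parts: int, radius: int) -> List[int]:
    """Evaluate each block independently (sequentially here, for simplicity)
    and concatenate the results in block order."""
    out: List[int] = []
    for out_lo, out_hi, in_lo, in_hi in partition_with_halo(len(data), parts, radius):
        # Each block sees only its own inputs plus the overlap the window requires.
        local = moving_sum(data[in_lo:in_hi], radius)
        # Keep only the outputs this block owns; the overlap outputs are discarded.
        out.extend(local[out_lo - in_lo:out_hi - in_lo])
    return out


if __name__ == "__main__":
    data = list(range(10))
    print(partition_with_halo(len(data), 2, 1))
    print(partitioned_moving_sum(data, 3, 2) == moving_sum(data, 2))
```

In this sketch a larger `radius` forces more overlap between blocks, and a window spanning the whole data set would leave no useful way to split the input, which is one way to read the statement that the relevance window constrains the available partitioning.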