User-Defined Functions in Modern Data Engines
Ioannis Foufoulas, A. Simitsis
2023 IEEE 39th International Conference on Data Engineering (ICDE), 2023-04-01
DOI: 10.1109/ICDE55515.2023.00276 (https://doi.org/10.1109/ICDE55515.2023.00276)
Modern data management applications involve complex processing tasks over large volumes of data. Although this falls naturally within the scope of relational databases, many such tasks cannot be expressed in SQL and require additional expressive power achieved via user-defined functions (UDFs). However, efficient processing of UDFs in data engines hinges on dealing with the impedance mismatch between UDF execution and SQL processing. In recent years, the problem of efficient UDF execution in modern data engines has gained significant traction. In this tutorial, we present recent advancements in this area, involving a broad scope of solutions ranging from algebraic, cost-based optimization to low-level physical query optimization, compilation, and execution. We also describe limitations and open issues, and discuss promising future research directions.