Demonstration of ThalamusDB: Answering Complex SQL Queries with Natural Language Predicates on Multi-Modal Data

Companion of the 2023 International Conference on Management of Data. Pub Date: 2023-06-04. DOI: 10.1145/3555041.3589730

Saehan Jo, Immanuel Trummer


Citations: 2

Abstract

ThalamusDB supports SQL queries with natural language predicates on multi-modal data. Our data model extends the relational model and integrates multi-modal data, including visual, audio, and text data, as columns. Users can write SQL queries including predicates on multi-modal data, described in natural language. In this demonstration, we show how ThalamusDB enables users to query multi-modal data. Visitors can write their own SQL queries on two real-world data sets gathered from Craigslist and YouTube. ThalamusDB has a specialized optimizer that selects execution plans that minimize the overall cost of answering such queries. Query execution involves pre-trained neural models as well as a relational database as processing engines. ThalamusDB collects a limited number of labels for selected data items to translate similarity scores into binary predicate evaluation. Our demonstration enables visitors to compare optimized plans against naive plans in terms of processing latency. ThalamusDB allows users to trade query result precision for reduced processing overheads. Our demonstration interface enables visitors to change the performance objectives and observe their effects on final result precision as well as computation time and number of labeling requests. Similar to online aggregation, our interactive interface allows users to track shrinking error bounds during query execution.
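The abstract says a limited number of labels is used to turn similarity scores into binary predicate results, with error bounds that shrink during execution. The sketch below is one simplified way to picture that idea, not ThalamusDB's actual algorithm or API. It assumes, for illustration only, that a single score threshold separates matching items from non-matching ones, so a binary search over items ranked by score narrows the bounds on the match count with each label. The function name `count_bounds_by_labeling` and the `ask_label` callback are hypothetical.

```python
from typing import Callable, List, Tuple


def count_bounds_by_labeling(
    scores: List[float],
    ask_label: Callable[[int], bool],
    max_labels: int,
) -> Tuple[int, int, int]:
    """Bound how many items satisfy a predicate, using at most max_labels labels.

    Assumption (this sketch only): if an item matches, every item with a
    higher similarity score matches too, so the matches form a prefix of
    the items ranked by descending score.

    Returns (lower_bound, upper_bound, labels_used) for the match count.
    """
    # Rank item indices from most to least similar to the predicate text.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    lo, hi = 0, len(order)  # invariant: lo <= true match count <= hi
    used = 0
    while lo < hi and used < max_labels:
        mid = (lo + hi) // 2
        used += 1
        if ask_label(order[mid]):
            lo = mid + 1  # ranks 0..mid all match under the assumption
        else:
            hi = mid      # ranks mid..end do not match
    return lo, hi, used


# Toy data: hypothetical similarity scores; the simulated labeler says an
# item matches when its score is at least 0.6.
scores = [0.91, 0.15, 0.78, 0.40, 0.66, 0.05, 0.83, 0.30]
labeler = lambda i: scores[i] >= 0.6

for budget in (1, 2, 3):
    print(budget, count_bounds_by_labeling(scores, labeler, budget))
```

A smaller label budget leaves a wider gap between the lower and upper bound, which mirrors the trade-off in the abstract between result precision and the number of labeling requests.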

