ATENA-PRO: Generating Personalized Exploration Notebooks with Constrained Reinforcement Learning

Companion of the 2023 International Conference on Management of Data
Pub Date: 2023-06-04
DOI: 10.1145/3555041.3589727

Tavor Lipman, T. Milo, Amit Somech

Citations: 0

Abstract

One of the most common and helpful practices of data scientists, when starting to explore a given dataset, is to examine existing data exploration notebooks prepared by other data analysts or scientists. These notebooks contain curated sessions of contextually related query operations that together demonstrate interesting hypotheses and conjectures on the data. Unfortunately, relevant notebooks, prepared on the same dataset and for the same analysis task, are often nonexistent or unavailable. In this work, we describe ATENA-PRO, a framework for auto-generating such relevant, personalized exploratory sessions. Using a novel specification language, users first describe their desired output notebook. Our language contains dedicated constructs for contextually connecting future output queries. These specifications are then used as input for a Deep Reinforcement Learning (DRL) engine, which auto-generates the personalized notebook. Our DRL engine relies on an existing, general-purpose DRL framework for data exploration. However, augmenting the generic framework with user specifications requires overcoming a difficult sparsity challenge, as only a small portion of the possible sessions may be compliant with the specifications. Inspired by solutions for constrained reinforcement learning, we devise a compound, flexible reward scheme as well as a specification-aware neural network architecture. Our experimental evaluation shows that the combination of these components allows ATENA-PRO to consistently generate interesting, personalized exploration sessions for various analysis tasks and datasets.
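The sparsity challenge and the idea of a compound reward can be illustrated with a small sketch. This is not the reward scheme used by ATENA-PRO, which the abstract does not detail; it is a minimal illustration under stated assumptions: a session is a list of query dictionaries, user specifications are plain predicates over a session, and the names `CompoundReward`, `has_op`, `compliance_weight`, and `full_bonus` are all hypothetical. The point is that partial credit for satisfied specifications gives the agent a learning signal before it finds a fully compliant session.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical representation (an assumption, not from the paper):
# a session is a list of query operations, each a dict such as
# {"op": "filter", "column": "country"}.
Query = dict
Session = List[Query]
Constraint = Callable[[Session], bool]


@dataclass
class CompoundReward:
    """Illustrative reward: generic exploration reward plus compliance terms."""

    constraints: List[Constraint]   # user specifications, as predicates
    compliance_weight: float = 1.0  # weight of the partial-compliance term
    full_bonus: float = 5.0         # extra reward when every constraint holds

    def compliance(self, session: Session) -> float:
        """Return the fraction of constraints the session satisfies."""
        if not self.constraints:
            return 1.0
        satisfied = sum(1 for c in self.constraints if c(session))
        return satisfied / len(self.constraints)

    def __call__(self, session: Session, generic_reward: float) -> float:
        frac = self.compliance(session)
        # Partial credit densifies an otherwise sparse compliance signal.
        reward = generic_reward + self.compliance_weight * frac
        if frac == 1.0:
            reward += self.full_bonus
        return reward


def has_op(op: str, column: str) -> Constraint:
    """Build a constraint requiring some query with this op on this column."""
    return lambda session: any(
        q.get("op") == op and q.get("column") == column for q in session
    )


reward_fn = CompoundReward([has_op("filter", "country"), has_op("group", "year")])
partial = [{"op": "filter", "column": "country"}]
full = partial + [{"op": "group", "column": "year"}]

print(reward_fn(partial, generic_reward=0.5))  # one of two constraints met
print(reward_fn(full, generic_reward=0.5))     # both met, bonus applied
```

In this sketch the generic reward is passed in as a number; in a real system it would come from the underlying exploration framework, and the weights would need tuning so that compliance does not swamp interestingness.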
