Proceedings of the 2nd Workshop on Human-In-the-Loop Data Analytics. Workshop on Human-In-the-Loop Data Analytics (2nd : 2017 : Chicago, Ill.)

Introducing quest: a query-driven framework to explain classification models on tabular data

Pub Date: 2022-06-12; DOI: 10.1145/3546930.3547497

Nadja Geisler, Carsten Binnig

Citations: 0

Another way to implement complex computations: functional-style SQL UDF

Pub Date: 2022-06-12; DOI: 10.1145/3546930.3547508

C. Duta

Citations: 1

Exploratory training: when trainers learn

Pub Date: 2022-06-12; DOI: 10.1145/3546930.3547500

Omeed Habibelahian, R. Shrestha, Arash Termehchy, Paolo Papotti

Abstract: Data systems often present examples and solicit labels from users to learn a target concept in supervised to semi-supervised learning. This selection of examples could even be done in an active fashion, i.e., active learning. Current systems assume that users always provide correct labeling with potentially a fixed and small chance of mistake. In several settings, users may have to explore and learn about the underlying data to label examples correctly, particularly for complex target concepts and models. For example, to provide accurate labeling for a model that detects noisy or abnormal values, users might need to investigate the underlying data to understand typical and clean values in the data. As users gradually learn about the target concept and data, they may revise their labeling strategies. Due to the significance and non-stationarity of errors in this setting, current systems may use incorrect labels and learn inaccurate models from the users. We report preliminary results for a user study over real-world datasets on modeling human learning while training the system and lay out the next steps in this investigation.

Citations: 0

Context sight: model understanding and debugging via interpretable context

Pub Date: 2022-06-12; DOI: 10.1145/3546930.3547502

Jun Yuan, E. Bertini

Abstract: Model interpretation is increasingly important for successful model development and deployment. In recent years, many explanation methods have been introduced to help humans understand how a machine learning model makes a decision on a specific instance. Recent studies show that contextualizing an individual model decision within a set of relevant examples can improve model understanding. However, there is a lack of systematic study on what factors are considered when generating and using the context examples to explain model predictions, and how context examples help with model understanding and debugging in practice. In this work, we first identify a taxonomy of context generation and summarization through literature review. We then present Context Sight, a visual analytics system that integrates customized context generation and multiple-level context summarization to assist context exploration and interpretation. We evaluate the usefulness of the system through a detailed use case. This work is an initial step for a set of systematic research on how contextualization can help data scientists and practitioners understand and diagnose model behaviors, based on which we will gain a better understanding of the usage of context.

Citations: 1

Learning to Validate the Predictions of Black Box Machine Learning Models on Unseen Data

Pub Date: 2019-07-05; DOI: 10.1145/3328519.3329126

S. Redyuk, Sebastian Schelter, Tammo Rukat, V. Markl, F. Biessmann

Abstract: When end users apply a machine learning (ML) model on new unlabeled data, it is difficult for them to decide whether they can trust its predictions. Errors or shifts in the target data can lead to hard-to-detect drops in the predictive quality of the model. We therefore propose an approach to assist non-ML experts working with pretrained ML models. Our approach estimates the change in prediction performance of a model on unseen target data. It does not require explicit distributional assumptions on the dataset shift between the training and target data. Instead, a domain expert can declaratively specify typical cases of dataset shift that she expects to observe in real-world data. Based on this information, we learn a performance predictor for pretrained black box models, which can be combined with the model, and automatically warns end users in case of unexpected performance drops. We demonstrate the effectiveness of our approach on two models, logistic regression and a neural network, applied to several real-world datasets.

Citations: 12

A Collaborative Framework for Structure Identification over Print Documents

Pub Date: 2019-07-05; DOI: 10.1145/3328519.3329131

Maeda F. Hanafi, M. Mannino, A. Abouzeid

Citations: 0

Knowledge Graph Programming with a Human-in-the-Loop: Preliminary Results

Pub Date: 2019-07-05; DOI: 10.1145/3328519.3329132

Yuze Lou, Mahfus Uddin, Noam Brown, Michael J. Cafarella

Citations: 2

Effective and Efficient Data Cleaning for Entity Matching

Pub Date: 2019-07-05; DOI: 10.1145/3328519.3329127

J. Ao, Rada Y. Chirkova

Abstract: As a key data-integration step, entity matching (EM) identifies tuples referring to the same real-world entities in disparate data sources. In many cases, the EM quality can be improved by repairing incorrect values in the data; at the same time, it is well known that the time costs of data cleaning by human experts could be prohibitive. In this paper, we focus on the time-consuming human-in-the-loop data-cleaning problem for relational EM, by recommending to human experts a time-efficient order in which values of attributes could be cleaned in the given data. Our proposed domain-independent cleaning framework aims to save human users' time, by guiding them in cleaning the EM inputs in an attribute order that is as conducive to maximizing EM accuracy as possible within a given constraint on the time they spend on cleaning. In guiding the cleaning process, our attribute-recommendation methods discover and take advantage of information provided by the data, and also use feedback from the EM engine. Our preliminary experimental results suggest that the proposed approach leads to measurable speedup, for a variety of time constraints, in the improvement of EM accuracy over the baseline approach, in which domain experts choose the sequence in which to clean the attributes of the inputs.

Citations: 0

Session details: Data Cleaning and Entity Resolution

Pub Date: 2019-07-05; DOI: 10.1145/3359610

Thibault Sellam

Citations: 0

Interactive Summarization of Large Document Collections

Pub Date: 2019-07-05; DOI: 10.1145/3328519.3329129

Benjamin Hättasch, Christian M. Meyer, Carsten Binnig

Citations: 1