Observing the Data Scientist: Using Manual Corrections As Implicit Feedback

Proceedings of the 2nd Workshop on Human-In-the-Loop Data Analytics. Workshop on Human-In-the-Loop Data Analytics (2nd : 2017 : Chicago, Ill.). Pub Date: 2017-05-14. DOI: 10.1145/3077257.3077272

Nurzety A. Azuan, Suzanne M. Embury, N. Paton


Citations: 8


Observing the Data Scientist: Using Manual Corrections As Implicit Feedback

Dataspaces aim to remove the up-front costs of information integration by gathering the needed domain information through targeted interactions with the end-user throughout the life-time of the integration. State-of-the-art tools are used to rapidly construct an initial (incorrect) integration, which is then refined in a pay-as-you-go manner by asking end-users to supply feedback on the resulting data. The idea is that end-users will choose to put effort into providing feedback on the areas of the integration where the quality is important to them, while other less well-used areas will receive a smaller share of user attention. This approach is promising but open problems remain. One issue is that the end-user loses control over the process. Their contribution is to specify their query requirements and to provide feedback on the results, as directed by the dataspace. But what feedback should the user supply to get the data they want? We propose a new approach to data integration in which the end-user and the dataspace work as equal partners to meet the integration goal. Both are able to perform data integration tasks directly, and both request and provide feedback on the results. In addition, the dataspace observes the actions of the end-user when carrying out integration, with the aim of automating that part of the work in future integration tasks. In this paper, we explore this idea by examining how a dataspace can observe an end-user at work, correcting errors in query results, to gather feedback needed to refine the mappings used for integration. We propose an algorithm for converting manual corrections to feedback, and present the results of a preliminary evaluation comparing this approach with seeking explicit feedback from end-users.
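The abstract does not describe the conversion algorithm itself, so the following is only a minimal illustrative sketch of the general idea, not the authors' method. It assumes that query results are available as rows keyed by an identifier both before and after the end-user's manual edits, that a changed cell rejects the old value and confirms the new one, and that a deleted row rejects all of its values. The names `Feedback`, `corrections_to_feedback`, and `score_mappings` are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, List

Row = Dict[str, Any]


@dataclass(frozen=True)
class Feedback:
    """One implicit feedback item inferred from a manual correction (hypothetical type)."""
    row_id: int
    attribute: str
    value: Any
    correct: bool  # True: the user supplied this value; False: the user removed it


def corrections_to_feedback(before: Dict[int, Row], after: Dict[int, Row]) -> List[Feedback]:
    """Diff the result set before and after manual editing.

    Assumption: unchanged cells carry no signal and produce no feedback.
    """
    feedback: List[Feedback] = []
    for row_id, old_row in before.items():
        new_row = after.get(row_id)
        for attr, old_value in old_row.items():
            if new_row is None:
                # Row deleted by the user: every value in it is rejected.
                feedback.append(Feedback(row_id, attr, old_value, False))
            elif new_row.get(attr) != old_value:
                # Cell edited: old value rejected, new value confirmed.
                feedback.append(Feedback(row_id, attr, old_value, False))
                feedback.append(Feedback(row_id, attr, new_row.get(attr), True))
    return feedback


def score_mappings(feedback: List[Feedback],
                   mapping_outputs: Dict[str, Dict[int, Row]]) -> Dict[str, int]:
    """Give each candidate mapping +1 per confirmed value it produced and -1 per rejected one."""
    scores: Dict[str, int] = {}
    for name, rows in mapping_outputs.items():
        score = 0
        for fb in feedback:
            row = rows.get(fb.row_id)
            if row is not None and fb.attribute in row and row[fb.attribute] == fb.value:
                score += 1 if fb.correct else -1
        scores[name] = score
    return scores


if __name__ == "__main__":
    before = {1: {"name": "Ann", "city": "Manchestr"}, 2: {"name": "Bob", "city": "Leeds"}}
    after = {1: {"name": "Ann", "city": "Manchester"}}  # one cell fixed, row 2 deleted
    fb = corrections_to_feedback(before, after)
    outputs = {
        "m1": {1: {"city": "Manchestr"}, 2: {"city": "Leeds"}},
        "m2": {1: {"city": "Manchester"}},
    }
    print(score_mappings(fb, outputs))
```

In this sketch the scores could be used to rank candidate mappings, so that a mapping whose output agrees with the user's corrections is preferred in later integration tasks. How the paper actually weighs or propagates such feedback is not stated in the abstract.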

