Observing the Data Scientist: Using Manual Corrections As Implicit Feedback

Nurzety A. Azuan, Suzanne M. Embury, N. Paton
Published in: Proceedings of the 2nd Workshop on Human-In-the-Loop Data Analytics (2nd : 2017 : Chicago, Ill.)
Publication date: 2017-05-14
DOI: 10.1145/3077257.3077272 (https://doi.org/10.1145/3077257.3077272)
Citations: 8

Abstract

Dataspaces aim to remove the up-front costs of information integration by gathering the needed domain information through targeted interactions with the end-user throughout the lifetime of the integration. State-of-the-art tools are used to rapidly construct an initial (incorrect) integration, which is then refined in a pay-as-you-go manner by asking end-users to supply feedback on the resulting data. The idea is that end-users will choose to put effort into providing feedback on the areas of the integration where quality matters to them, while less-used areas will receive a smaller share of user attention. This approach is promising, but open problems remain. One issue is that the end-user loses control over the process: their contribution is limited to specifying their query requirements and providing feedback on the results, as directed by the dataspace. But what feedback should the user supply to get the data they want? We propose a new approach to data integration in which the end-user and the dataspace work as equal partners to meet the integration goal. Both are able to perform data integration tasks directly, and both request and provide feedback on the results. In addition, the dataspace observes the actions of the end-user when carrying out integration, with the aim of automating that part of the work in future integration tasks. In this paper, we explore this idea by examining how a dataspace can observe an end-user at work, correcting errors in query results, to gather the feedback needed to refine the mappings used for integration. We propose an algorithm for converting manual corrections to feedback, and present the results of a preliminary evaluation comparing this approach with seeking explicit feedback from end-users.
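The abstract does not spell out the conversion algorithm, so the following is only an illustrative sketch of what turning manual corrections into feedback could look like, not the authors' method. It assumes that query results are keyed rows, that a deleted row implies a false positive, an edited value implies a false positive on that attribute, an untouched value is weakly taken as a true positive, and a row added by the user implies a false negative. The names `Feedback` and `corrections_to_feedback` and the labels `tp`, `fp`, and `fn` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

Row = Dict[str, str]  # attribute name -> value


@dataclass(frozen=True)
class Feedback:
    """One piece of implicit feedback (hypothetical representation)."""
    key: str                  # identifier of the result row
    attribute: Optional[str]  # None means the feedback is about the whole row
    label: str                # "tp", "fp" or "fn" (assumed label set)


def corrections_to_feedback(before: Dict[str, Row],
                            after: Dict[str, Row]) -> List[Feedback]:
    """Diff the query result before and after the user's manual corrections."""
    feedback: List[Feedback] = []
    for key, old_row in before.items():
        new_row = after.get(key)
        if new_row is None:
            # Assumption: the user deleted the row because it should not be there.
            feedback.append(Feedback(key, None, "fp"))
            continue
        for attr, old_value in old_row.items():
            # Assumption: an edited value was wrong, an untouched one was right.
            label = "tp" if new_row.get(attr) == old_value else "fp"
            feedback.append(Feedback(key, attr, label))
    for key in after:
        if key not in before:
            # Assumption: a row the user added was missing from the result.
            feedback.append(Feedback(key, None, "fn"))
    return feedback


if __name__ == "__main__":
    before = {
        "r1": {"name": "Ann", "city": "Leeds"},
        "r2": {"name": "Bob", "city": "York"},
    }
    after = {
        "r1": {"name": "Ann", "city": "Manchester"},  # city corrected
        "r3": {"name": "Cat", "city": "Hull"},        # row added, r2 deleted
    }
    for item in corrections_to_feedback(before, after):
        print(item)
```

Treating untouched values as true positives is the weakest of these assumptions, since the user may simply not have inspected them; a real system would likely weight such feedback lower than explicit edits.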