Explaining Why: How Instructions and User Interfaces Impact Annotator Rationales When Labeling Text Data

North American Chapter of the Association for Computational Linguistics | Pub Date: 2022-05-04 | DOI: 10.48550/arXiv.2205.02005

Ankan Mullick, Sukannya Purkayastha, Pawan Goyal, Niloy Ganguly


Citations: 2

Abstract

In the context of data labeling, NLP researchers are increasingly interested in having humans select rationales, a subset of input tokens relevant to the chosen label. We conducted a 332-participant online user study to understand how humans select rationales, especially how different instructions and user interface affordances impact the rationales chosen. Participants labeled ten movie reviews as positive or negative, selecting words and phrases supporting their label as rationales. We varied the instructions given, the rationale-selection task, and the user interface. Participants often selected about 12% of input tokens as rationales, but selected fewer if unable to drag over multiple tokens at once. Whereas participants were near unanimous in their data labels, they were far less consistent in their rationales. The user interface affordances and task greatly impacted the types of rationales chosen. We also observed large variance across participants.
