Optimizing Nugget Annotations with Active Learning

G. Baruah, Haotian Zhang, Rakesh Guttikonda, Jimmy J. Lin, Mark D. Smucker, Olga Vechtomova
Published in: Proceedings of the 25th ACM International on Conference on Information and Knowledge Management, 2016-10-24
DOI: 10.1145/2983323.2983694 (https://doi.org/10.1145/2983323.2983694)
Citation count: 7

Abstract

Nugget-based evaluations, such as those deployed in the TREC Temporal Summarization and Question Answering tracks, require human assessors to determine whether a nugget is present in a given piece of text. This process, known as nugget annotation, is labor-intensive. In this paper, we present two active learning techniques that prioritize the sequence in which candidate nugget/sentence pairs are presented to an assessor, based on the likelihood that the sentence contains a nugget. Our approach builds on the recognition that nugget annotation is similar to high-recall retrieval, and we adapt proven existing solutions. Simulation experiments with four existing TREC test collections show that our techniques yield far more matches for a given level of effort than baselines that are typically deployed in previous nugget-based evaluations.
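
The prioritization idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the feature set (term overlap plus sentence tokens), the logistic model, the batch size, and the names `features`, `score`, `prioritize`, and `assess` are all assumptions made for illustration. It shows only the general loop of ranking unjudged nugget/sentence pairs by estimated match likelihood, asking the assessor about the top of the ranking, and retraining on the judgments collected so far.

```python
import math
from typing import Callable, Dict, List, Tuple

Pair = Tuple[str, str]  # (nugget text, sentence text)


def features(pair: Pair) -> Dict[str, float]:
    """Sparse features for one nugget/sentence pair (an illustrative choice)."""
    nugget, sentence = (set(text.lower().split()) for text in pair)
    union = nugget | sentence
    overlap = len(nugget & sentence) / len(union) if union else 0.0
    feats = {"bias": 1.0, "overlap": overlap}
    for token in sentence:
        feats["tok:" + token] = 1.0
    return feats


def score(weights: Dict[str, float], feats: Dict[str, float]) -> float:
    """Logistic estimate of the likelihood that the sentence contains the nugget."""
    z = sum(weights.get(name, 0.0) * value for name, value in feats.items())
    z = max(-30.0, min(30.0, z))  # clamp to keep math.exp well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def prioritize(
    candidates: List[Pair],
    assess: Callable[[Pair], bool],  # hypothetical stand-in for the human assessor
    batch_size: int = 2,
    epochs: int = 20,
    rate: float = 0.5,
) -> List[Tuple[Pair, bool]]:
    """Return pairs in the order presented, each with the assessor's judgment."""
    weights: Dict[str, float] = {"overlap": 1.0}  # before any labels, rank by overlap
    pool = [(pair, features(pair)) for pair in candidates]
    judged: List[Tuple[Pair, Dict[str, float], bool]] = []

    while pool:
        # Present the pairs currently estimated as most likely to match.
        pool.sort(key=lambda item: score(weights, item[1]), reverse=True)
        batch, pool = pool[:batch_size], pool[batch_size:]
        for pair, feats in batch:
            judged.append((pair, feats, assess(pair)))

        # Retrain on every judgment so far with plain stochastic gradient steps.
        for _ in range(epochs):
            for _, feats, label in judged:
                error = (1.0 if label else 0.0) - score(weights, feats)
                for name, value in feats.items():
                    weights[name] = weights.get(name, 0.0) + rate * error * value

    return [(pair, label) for pair, _, label in judged]


# Toy usage with a simulated assessor backed by known matches.
nugget = "bridge collapsed in storm"
sentences = [
    "the weather was mild on tuesday",
    "the bridge collapsed in the storm",
    "officials said the bridge collapsed",
    "tickets go on sale next week",
]
pairs = [(nugget, s) for s in sentences]
known_matches = {pairs[1], pairs[2]}
for (_, sentence), matched in prioritize(pairs, lambda p: p in known_matches):
    print(matched, sentence)
```

In a simulation of this kind, effort can be measured as the number of pairs presented, and the benefit as the number of matches found within that budget, which is the comparison the abstract describes against baseline orderings.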