Leveraging Human-Machine Interactions for Computer Vision Dataset Quality Enhancement

Esla Timothy Anzaku, Hyesoo Hong, Jin-Woo Park, Wonjun Yang, Kangmin Kim, Jongbum Won, Deshika Vinoshani Kumari Herath, Arnout Van Messem, W. D. Neve
{"title":"Leveraging Human-Machine Interactions for Computer Vision Dataset Quality Enhancement","authors":"Esla Timothy Anzaku, Hyesoo Hong, Jin-Woo Park, Wonjun Yang, Kangmin Kim, Jongbum Won, Deshika Vinoshani Kumari Herath, Arnout Van Messem, W. D. Neve","doi":"10.48550/arXiv.2401.17736","DOIUrl":null,"url":null,"abstract":"Large-scale datasets for single-label multi-class classification, such as \\emph{ImageNet-1k}, have been instrumental in advancing deep learning and computer vision. However, a critical and often understudied aspect is the comprehensive quality assessment of these datasets, especially regarding potential multi-label annotation errors. In this paper, we introduce a lightweight, user-friendly, and scalable framework that synergizes human and machine intelligence for efficient dataset validation and quality enhancement. We term this novel framework \\emph{Multilabelfy}. Central to Multilabelfy is an adaptable web-based platform that systematically guides annotators through the re-evaluation process, effectively leveraging human-machine interactions to enhance dataset quality. By using Multilabelfy on the ImageNetV2 dataset, we found that approximately $47.88\\%$ of the images contained at least two labels, underscoring the need for more rigorous assessments of such influential datasets. Furthermore, our analysis showed a negative correlation between the number of potential labels per image and model top-1 accuracy, illuminating a crucial factor in model evaluation and selection. Our open-source framework, Multilabelfy, offers a convenient, lightweight solution for dataset enhancement, emphasizing multi-label proportions. This study tackles major challenges in dataset integrity and provides key insights into model performance evaluation. Moreover, it underscores the advantages of integrating human expertise with machine capabilities to produce more robust models and trustworthy data development. The source code for Multilabelfy will be available at https://github.com/esla/Multilabelfy. \\keywords{Computer Vision \\and Dataset Quality Enhancement \\and Dataset Validation \\and Human-Computer Interaction \\and Multi-label Annotation.}","PeriodicalId":224881,"journal":{"name":"International Conference on Intelligent Human Computer Interaction","volume":"812 ","pages":"295-309"},"PeriodicalIF":0.0000,"publicationDate":"2024-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Intelligent Human Computer Interaction","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2401.17736","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Large-scale datasets for single-label multi-class classification, such as ImageNet-1k, have been instrumental in advancing deep learning and computer vision. However, a critical and often understudied aspect is the comprehensive quality assessment of these datasets, especially regarding potential multi-label annotation errors. In this paper, we introduce a lightweight, user-friendly, and scalable framework that synergizes human and machine intelligence for efficient dataset validation and quality enhancement. We term this novel framework Multilabelfy. Central to Multilabelfy is an adaptable web-based platform that systematically guides annotators through the re-evaluation process, effectively leveraging human-machine interactions to enhance dataset quality. By using Multilabelfy on the ImageNetV2 dataset, we found that approximately 47.88% of the images contained at least two labels, underscoring the need for more rigorous assessments of such influential datasets. Furthermore, our analysis showed a negative correlation between the number of potential labels per image and model top-1 accuracy, illuminating a crucial factor in model evaluation and selection. Our open-source framework, Multilabelfy, offers a convenient, lightweight solution for dataset enhancement, emphasizing multi-label proportions. This study tackles major challenges in dataset integrity and provides key insights into model performance evaluation. Moreover, it underscores the advantages of integrating human expertise with machine capabilities to produce more robust models and trustworthy data development. The source code for Multilabelfy will be available at https://github.com/esla/Multilabelfy.

Keywords: Computer Vision, Dataset Quality Enhancement, Dataset Validation, Human-Computer Interaction, Multi-label Annotation.
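To make the reported statistics concrete, the sketch below shows one way the two quantities mentioned in the abstract could be computed from re-annotated data: the share of images carrying at least two validated labels, and top-1 accuracy broken down by label count. This is a minimal illustration only; the data structures and names (`annotations`, `top1_correct`) are assumptions for the sketch and are not taken from the Multilabelfy codebase.

```python
# Minimal sketch (not from the Multilabelfy repository): given per-image sets of
# validated labels and per-image top-1 correctness flags, compute (a) the share
# of images with at least two labels and (b) mean top-1 accuracy per label count.
from collections import defaultdict


def multilabel_proportion(annotations):
    """annotations: dict mapping image_id -> set of validated labels."""
    n_multi = sum(1 for labels in annotations.values() if len(labels) >= 2)
    return n_multi / len(annotations)


def accuracy_by_label_count(annotations, top1_correct):
    """top1_correct: dict mapping image_id -> bool, True if the model's top-1
    prediction is among the validated labels for that image."""
    buckets = defaultdict(list)
    for image_id, labels in annotations.items():
        buckets[len(labels)].append(1.0 if top1_correct[image_id] else 0.0)
    return {count: sum(vals) / len(vals) for count, vals in sorted(buckets.items())}


if __name__ == "__main__":
    # Toy example with hypothetical image IDs and labels.
    annotations = {
        "img_0": {"tabby_cat"},
        "img_1": {"laptop", "desk"},
        "img_2": {"dog", "ball", "grass"},
    }
    top1_correct = {"img_0": True, "img_1": True, "img_2": False}
    print(f"multi-label proportion: {multilabel_proportion(annotations):.2%}")
    print(accuracy_by_label_count(annotations, top1_correct))
```

Plotting the per-label-count accuracies from `accuracy_by_label_count` against the label counts is one simple way to visualize the negative correlation the paper reports between potential labels per image and top-1 accuracy.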