Robust classification with noisy labels using Venn–Abers predictors
Ichraq Lemghari, Sylvie Le Hégarat-Mascle, Emanuel Aldea, Jennifer Vandoni
Journal of Electronic Imaging, published 2024-06-01. DOI: https://doi.org/10.1117/1.jei.33.3.031210
Abstract
Deep learning methods have driven impressive advances in computer vision over the past decades, largely because they extract non-linear features that are well adapted to the task at hand. For supervised approaches, data labeling is essential to achieving high performance; however, labeling can be so tedious, or so difficult in challenging contexts (e.g., detection of specific defects, unconventional data annotations), that even experts sometimes provide a wrong ground-truth label. Focusing on classification problems, this paper addresses the handling of noisy labels in datasets. Specifically, we first detect the noisy samples of a dataset using set-valued labels and then improve their classification using Venn–Abers predictors. With a 40% two-class pair-flip noise ratio, the method reaches an accuracy above 0.99 on a noisified version of digit MNIST and above 0.90 on a noisified version of CIFAR-10, two widely used image classification datasets; with 10-class uniform noise at a 40% ratio, it reaches 0.87 accuracy on CIFAR-10.
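The abstract does not describe how the Venn–Abers predictors are built or wired into the pipeline. As a rough illustration only, the sketch below shows the textbook inductive Venn–Abers construction for a binary problem: the test score is added to a held-out calibration set once with label 0 and once with label 1, an isotonic (non-decreasing) fit is computed each time, and the two fitted values form a probability interval `(p0, p1)`. The function names, the toy calibration data, and the merge formula `p1 / (1 - p0 + p1)` are assumptions made for this example, not the authors' implementation.

```python
from itertools import groupby


def isotonic_fit(points):
    """Least-squares non-decreasing fit via pool-adjacent-violators.

    points: list of (score, label) pairs with label in {0, 1}.
    Returns a dict mapping each distinct score to its fitted value.
    """
    blocks = []  # each block: [label_sum, count, scores_in_block]
    for score, group in groupby(sorted(points), key=lambda p: p[0]):
        labels = [y for _, y in group]
        blocks.append([sum(labels), len(labels), [score]])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            label_sum, count, scores = blocks.pop()
            blocks[-1][0] += label_sum
            blocks[-1][1] += count
            blocks[-1][2] += scores
    return {x: s / n for s, n, xs in blocks for x in xs}


def venn_abers(calibration, test_score):
    """Return (p0, p1): calibrated probabilities of class 1 obtained by
    assuming the test sample has label 0, then label 1."""
    p0 = isotonic_fit(calibration + [(test_score, 0)])[test_score]
    p1 = isotonic_fit(calibration + [(test_score, 1)])[test_score]
    return p0, p1


def merge(p0, p1):
    """Collapse the interval into one probability (a commonly used merge)."""
    return p1 / (1.0 - p0 + p1)


# Hypothetical calibration set: (classifier score, possibly noisy label).
calibration = [(0.1, 0), (0.2, 0), (0.3, 1), (0.4, 0),
               (0.6, 1), (0.7, 1), (0.9, 1)]
p0, p1 = venn_abers(calibration, 0.5)
print(p0, p1, merge(p0, p1))
```

The width `p1 - p0` indicates how much the calibration data constrain the prediction at that score. Whether the paper uses this width, the merged value, or something else to handle noisy samples is not stated in the abstract.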
About the journal:
The Journal of Electronic Imaging publishes peer-reviewed papers in all technology areas that make up the field of electronic imaging and are normally considered in the design, engineering, and applications of electronic imaging systems.