Improved label noise identification by exploiting unlabeled data

Hongqiang Wei, Qi Zhu, D. Guan, Weiwei Yuan, A. Khattak, Francis Chow
Published in: 2017 International Conference on Security, Pattern Analysis, and Cybernetics (SPAC), 2017-12-01
DOI: https://doi.org/10.1109/SPAC.2017.8304291
Citations: 3

Abstract

In machine learning, the available training samples are not always perfect: some labels can be corrupted, and such corrupted labels are called label noise. Label noise can reduce accuracy and also increase model complexity. To mitigate these detrimental effects, noise filtering is widely used; it tries to identify mislabeled samples and remove them prior to learning. Almost all existing work focuses only on the mislabeled training dataset and ignores unlabeled data, even though unlabeled data are easily accessible in many applications. In this work, we explore how to use unlabeled data to improve noise filtering. To this end, we propose a method named MFUDCM (Multiple Filtering with the aid of Unlabeled Data using Confidence Measurement). The method applies a novel multiple soft majority voting scheme to make use of unlabeled data. In addition, MFUDCM is expected to identify mislabeled data more accurately by using the concept of multiple voting. Finally, the validity of MFUDCM is confirmed by experiments and by comparison with other methods.
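The abstract does not spell out how MFUDCM performs its voting or how it draws on unlabeled data, so the following is only a minimal sketch of the generic idea it builds on: a cross-validated majority-vote noise filter over the labeled set. The function names, the choice of three simple voters (1-NN, 3-NN, nearest centroid), and the toy data are all assumptions made for illustration and are not taken from the paper.

```python
import random
from collections import Counter

def knn_predict(train_X, train_y, x, k):
    """Predict the label of x by majority vote among its k nearest training points."""
    order = sorted(range(len(train_X)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def centroid_predict(train_X, train_y, x):
    """Predict the label whose class mean is closest to x."""
    sums, counts = {}, Counter(train_y)
    for row, label in zip(train_X, train_y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, v in enumerate(row):
            acc[j] += v
    def dist(label):
        return sum((s / counts[label] - b) ** 2 for s, b in zip(sums[label], x))
    return min(sums, key=dist)

def majority_vote_filter(X, y, n_folds=5, seed=0):
    """Return sorted indices of samples whose given label most voters reject.

    Hypothetical helper: each sample is judged only by voters trained on the
    other folds, so a sample never votes on its own label.
    """
    voters = [
        lambda tx, ty, x: knn_predict(tx, ty, x, 1),
        lambda tx, ty, x: knn_predict(tx, ty, x, 3),
        centroid_predict,
    ]
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::n_folds] for f in range(n_folds)]
    suspects = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        tx = [X[i] for i in train]
        ty = [y[i] for i in train]
        for i in fold:
            wrong = sum(1 for vote in voters if vote(tx, ty, X[i]) != y[i])
            if wrong > len(voters) / 2:  # strict majority disagrees with the label
                suspects.append(i)
    return sorted(suspects)

# Toy data (assumed): two well-separated clusters, one label flipped on purpose.
X = [(0.1 * i, 0.0) for i in range(10)] + [(10.0 + 0.1 * i, 0.0) for i in range(10)]
y = [0] * 10 + [1] * 10
y[4] = 1  # injected label noise

suspects = majority_vote_filter(X, y)
print(suspects)  # prints [4]
```

Requiring a strict majority keeps a single overly sensitive voter (here 1-NN, which can be misled by a neighboring noisy point) from discarding clean samples. A soft-voting variant would sum per-voter confidence scores instead of counting hard votes; the abstract names that idea but gives no details, so it is not sketched here.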