HSALC: hard sample aware label correction for medical image classification

Impact factor 3.0; Zone 4 (Computer Science); JCR Q2 (COMPUTER SCIENCE, INFORMATION SYSTEMS)
Yangtao Wang, Yicheng Ye, Yanzhao Xie, Maobin Tang, Lisheng Fan
Journal: Multimedia Tools and Applications
Publication date: 2024-09-02
Publication type: Journal Article
DOI: 10.1007/s11042-024-20114-0 (https://doi.org/10.1007/s11042-024-20114-0)
Citations: 0

Abstract

Automatic medical image classification has long been a research hotspot, but existing methods suffer from label noise: they either discard samples with noisy labels or correct labels wrongly, which seriously limits classification performance. To address these problems, we propose a hard sample aware label correction method (HSALC) for medical image classification. HSALC consists mainly of a sample division module, a clean·hard·noisy (CHN) detection module, and a label noise correction module. First, in the sample division module, we design a sample division criterion based on training difficulty and training losses to divide all samples into three preliminary subsets: clean samples, hard samples, and noisy samples. Second, in the CHN detection module, we add noise to these clean samples and repeatedly apply the sample division criterion to all data, which yields highly reliable clean, hard, and noisy subsets. Finally, in the label noise correction module, to make full use of every available sample, we train a correction model to correct as many wrong labels of noisy samples as possible, which yields a highly purified dataset. We conduct extensive experiments on five image datasets: three medical image datasets and two natural image datasets. The results demonstrate that HSALC greatly improves classification performance on noisily labeled datasets, especially at high noise ratios. The source code is publicly available on GitHub: https://github.com/YYC117/HSALC.
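To make the first two modules more concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not state the actual division criterion, so the sketch assumes a simplified one based only on each sample's mean training loss, with hypothetical thresholds `low` and `high`. The function names `divide_samples` and `inject_symmetric_noise`, and the choice of symmetric label flipping for the "add noise to the clean samples" step, are likewise assumptions made for illustration.

```python
import random
from statistics import mean


def divide_samples(loss_history, low=0.5, high=1.5):
    """Split sample ids into (clean, hard, noisy) by mean training loss.

    loss_history maps a sample id to its list of per-epoch losses.
    low and high are hypothetical thresholds, not values from the paper.
    """
    clean, hard, noisy = [], [], []
    for sample_id, losses in loss_history.items():
        avg = mean(losses)
        if avg < low:
            clean.append(sample_id)   # consistently low loss
        elif avg < high:
            hard.append(sample_id)    # learnable, but slowly
        else:
            noisy.append(sample_id)   # loss stays high: label is suspect
    return clean, hard, noisy


def inject_symmetric_noise(labels, sample_ids, num_classes, ratio, seed=0):
    """Flip the labels of a random fraction of sample_ids to another class.

    Returns a new label dict and the set of ids whose labels were flipped,
    so a detector can later be scored against known noise.
    """
    rng = random.Random(seed)
    noisy_labels = dict(labels)
    flipped = rng.sample(list(sample_ids), int(len(sample_ids) * ratio))
    for sample_id in flipped:
        others = [c for c in range(num_classes) if c != labels[sample_id]]
        noisy_labels[sample_id] = rng.choice(others)
    return noisy_labels, set(flipped)


# Toy per-epoch losses for three samples.
history = {"a": [0.9, 0.3, 0.1], "b": [1.4, 1.0, 0.8], "c": [2.3, 2.2, 2.4]}
clean, hard, noisy = divide_samples(history)

# Corrupt half of a small set of trusted labels (three classes assumed).
labels = {"a": 0, "d": 1, "e": 2, "f": 0}
corrupted, flipped = inject_symmetric_noise(labels, list(labels), 3, 0.5)
```

Because the flipped ids are known, rerunning a division criterion on the corrupted labels gives a way to check how reliably it separates noisy samples from clean ones, which is the general idea the abstract describes for the CHN detection module.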


Source journal
Multimedia Tools and Applications (Engineering Technology; Engineering: Electrical & Electronic)
CiteScore: 7.20
Self-citation rate: 16.70%
Articles published: 2439
Review time: 9.2 months
Journal description: Multimedia Tools and Applications publishes original research articles on multimedia development and system support tools as well as case studies of multimedia applications. It also features experimental and survey articles. The journal is intended for academics, practitioners, scientists and engineers who are involved in multimedia system research, design and applications. All papers are peer reviewed. Specific areas of interest include multimedia tools, multimedia applications, and prototype multimedia systems and platforms.