Leveraging sparse annotations for leukemia diagnosis on the large leukemia dataset

Impact factor 11.8; Region 1 (Medicine); JCR Q1 (Computer Science, Artificial Intelligence)
Abdul Rehman, Talha Meraj, Aiman Mahmood Minhas, Ayisha Imran, Mohsen Ali, Waqas Sultani, Mubarak Shah
DOI: 10.1016/j.media.2025.103760
Journal: Medical image analysis, volume 107, Article 103760
Publication date: 2025-08-19
Publication type: Journal Article
Article page: https://www.sciencedirect.com/science/article/pii/S1361841525003068

Abstract

Leukemia is the 10th most frequently diagnosed cancer and one of the leading causes of cancer-related deaths worldwide. Realistic analysis of leukemia requires white blood cell (WBC) localization, classification, and morphological assessment. Despite advances in deep learning for medical imaging, leukemia analysis lacks a large, diverse, multi-task dataset, and existing small datasets lack domain diversity, which limits their real-world applicability. To address these dataset challenges, we present a large-scale WBC dataset named 'Large Leukemia Dataset' (LLD) and novel methods for detecting WBCs together with their attributes. Our contribution is threefold. First, we present a large-scale leukemia dataset collected from Peripheral Blood Films (PBF) of 48 patients, using multiple microscopes, multiple cameras, and multiple magnifications. To enhance diagnostic explainability and acceptance by medical experts, each leukemia cell is annotated at 100x with 7 morphological attributes, ranging from Cell Size to Nuclear Shape. Second, we propose a multi-task model that not only detects WBCs but also predicts their attributes, providing an interpretable and clinically meaningful solution. Third, we propose a method for WBC detection with attribute analysis using sparse annotations. This approach reduces the annotation burden on hematologists, who need to mark only a small area within the field of view. Our method enables the model to leverage the entire field of view rather than only the annotated regions, enhancing learning efficiency and diagnostic accuracy. From diagnostic explainability to overcoming domain-shift challenges, the presented datasets can be used for many challenging aspects of microscopic image analysis. The datasets, code, and demo are available at: https://im.itu.edu.pk/sparse-leukemiaattri/.
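
To make the sparse-annotation setting concrete, the following is a minimal sketch, not the authors' method. It assumes the hematologist's marked area is an axis-aligned rectangle and that candidate cells are given as bounding boxes; it only shows how a field of view could be partitioned into cells that carry annotations and cells that do not. All names (CellAnnotation, split_by_annotated_region, the attribute values) are hypothetical, and the abstract does not specify how the unannotated part is actually used during training.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# (x_min, y_min, x_max, y_max) in pixels; an assumed convention.
Box = Tuple[float, float, float, float]


@dataclass
class CellAnnotation:
    """One WBC: bounding box, class label, and morphological attributes."""
    box: Box
    label: str
    # Only two of the 7 attribute names appear in the abstract
    # (Cell Size, Nuclear Shape); the rest are not filled in here.
    attributes: Dict[str, str] = field(default_factory=dict)


def box_center(box: Box) -> Tuple[float, float]:
    """Return the (x, y) center of a bounding box."""
    x_min, y_min, x_max, y_max = box
    return ((x_min + x_max) / 2.0, (y_min + y_max) / 2.0)


def inside(region: Box, box: Box) -> bool:
    """True if the center of `box` falls inside `region`."""
    cx, cy = box_center(box)
    rx_min, ry_min, rx_max, ry_max = region
    return rx_min <= cx <= rx_max and ry_min <= cy <= ry_max


def split_by_annotated_region(
    candidates: List[Box], region: Box
) -> Tuple[List[Box], List[Box]]:
    """Partition candidate boxes into annotated-region vs. unlabeled boxes."""
    supervised = [b for b in candidates if inside(region, b)]
    unlabeled = [b for b in candidates if not inside(region, b)]
    return supervised, unlabeled


# Hypothetical example: one small annotated region in a larger field of view.
region = (0.0, 0.0, 100.0, 100.0)
candidates = [(10.0, 10.0, 30.0, 30.0), (150.0, 150.0, 180.0, 180.0)]
supervised, unlabeled = split_by_annotated_region(candidates, region)

# Hypothetical label and attribute values, for illustration only.
example = CellAnnotation(
    box=supervised[0],
    label="WBC",
    attributes={"Cell Size": "large", "Nuclear Shape": "irregular"},
)
```

In this sketch, boxes in `supervised` would have ground-truth labels and attributes, while boxes in `unlabeled` lie in the rest of the field of view that the described method also aims to exploit.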
Source journal: Medical image analysis
Category: Engineering & Technology, Engineering: Biomedical
CiteScore: 22.10
Self-citation rate: 6.40%
Articles published: 309
Review time: 6.6 months
Journal description: Medical Image Analysis serves as a platform for sharing new research findings in the realm of medical and biological image analysis, with a focus on applications of computer vision, virtual reality, and robotics to biomedical imaging challenges. The journal prioritizes the publication of high-quality, original papers contributing to the fundamental science of processing, analyzing, and utilizing medical and biological images. It welcomes approaches utilizing biomedical image datasets across all spatial scales, from molecular/cellular imaging to tissue/organ imaging.