Distributed collaborative machine learning in real-world application scenario: A white blood cell subtypes classification case study
Authors: Lorenzo Putzu, Simone Porcu, Andrea Loddo
Journal: Image and Vision Computing, volume 162, Article 105673
Publication date: 2025-08-11
Publication type: Journal Article
DOI: 10.1016/j.imavis.2025.105673
URL: https://www.sciencedirect.com/science/article/pii/S0262885625002616
Impact factor: 4.2 (JCR Q2, Computer Science, Artificial Intelligence)
Citations: 0
Abstract
White blood cell (WBC) subtype classification is a critical step in monitoring an individual’s health. However, it remains challenging because of the significant morphological variability of WBCs and the domain shift introduced by differing acquisition protocols across hospitals. Numerous approaches have been proposed to mitigate domain shift, including supervised and unsupervised domain adaptation, as well as domain generalisation. These methods, however, require either a sufficient number of representative target images, even if unlabelled, or a sufficient number of images from multiple sources, which may not be feasible under privacy regulations. In this study, we explore an alternative paradigm, known as Distributed Collaborative Machine Learning (DCML), which exploits images from different sources in a privacy-preserving setup. Although DCML methods seem well suited to this application, to the best of our knowledge they have not been used for this task or to address the above-mentioned issues. We argue that DCML deserves further consideration in medical imaging as a potential alternative solution against domain shift in a privacy-preserving setup. To substantiate our view, we consider three DCML methods: early fusion, late fusion and federated learning, each offering distinct trade-offs in terms of training constraints, computational overhead and communication costs. We then conduct an extensive cross-dataset experimental evaluation on four benchmark datasets and provide evidence that even simple implementations of DCML methods can effectively mitigate domain shift in WBC classification tasks.
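As an illustration only, two of the three DCML methods named in the abstract can be sketched in a few lines of Python. The abstract does not describe how the authors implement them, so everything below is an assumption: late fusion is shown as a plain average of the class probabilities produced by independently trained site models, and federated learning is shown as the sample-weighted parameter averaging step commonly known as FedAvg. The function names `late_fusion` and `federated_average`, and the toy numbers, are hypothetical.

```python
from typing import Dict, List, Sequence


def late_fusion(probabilities: Sequence[Sequence[float]]) -> List[float]:
    """Average the class-probability vectors of several site models.

    Assumption: each site trains its own classifier and only the
    predicted probabilities for one image are combined.
    """
    n_models = len(probabilities)
    n_classes = len(probabilities[0])
    return [
        sum(model_probs[c] for model_probs in probabilities) / n_models
        for c in range(n_classes)
    ]


def federated_average(
    site_weights: Sequence[Dict[str, List[float]]],
    site_sizes: Sequence[int],
) -> Dict[str, List[float]]:
    """Sample-weighted average of model parameters (FedAvg-style step).

    Assumption: every site holds a model with identical parameter
    names and shapes, and shares parameters rather than images.
    """
    total = sum(site_sizes)
    averaged: Dict[str, List[float]] = {}
    for name in site_weights[0]:
        length = len(site_weights[0][name])
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(site_weights, site_sizes)) / total
            for i in range(length)
        ]
    return averaged


if __name__ == "__main__":
    # Two hypothetical hospitals scoring one image over two WBC classes.
    print(late_fusion([[0.8, 0.2], [0.4, 0.6]]))
    # Two hypothetical hospitals with 1 and 3 local training images.
    print(federated_average([{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}], [1, 3]))
```

In both sketches only predictions or parameters leave a site, never the images themselves, which is the privacy-preserving property the abstract emphasises. Early fusion, commonly understood as combining inputs or features before a single model is trained, is omitted here because the abstract gives no detail on how it is made privacy-preserving.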
About the journal:
Image and Vision Computing has as a primary aim the provision of an effective medium of interchange for the results of high quality theoretical and applied research fundamental to all aspects of image interpretation and computer vision. The journal publishes work that proposes new image interpretation and computer vision methodology or addresses the application of such methods to real world scenes. It seeks to strengthen a deeper understanding in the discipline by encouraging the quantitative comparison and performance evaluation of the proposed methodology. The coverage includes: image interpretation, scene modelling, object recognition and tracking, shape analysis, monitoring and surveillance, active vision and robotic systems, SLAM, biologically-inspired computer vision, motion analysis, stereo vision, document image understanding, character and handwritten text recognition, face and gesture recognition, biometrics, vision-based human-computer interaction, human activity and behavior understanding, data fusion from multiple sensor inputs, image databases.