{"title":"通过OOD过滤和自集成增强跨区域白细胞分类的测试时间。","authors":"Lorenzo Putzu, Andrea Loddo, Cecilia Di Ruberto","doi":"10.3390/jimaging11090295","DOIUrl":null,"url":null,"abstract":"<p><p>Domain shift poses a major challenge in many Machine Learning applications due to variations in data acquisition protocols, particularly in the medical field. Test-time augmentation (TTA) can solve the domain shift issue and improve robustness by aggregating predictions from multiple augmented versions of the same input. However, TTA may inadvertently generate unrealistic or Out-of-Distribution (OOD) samples that negatively affect prediction quality. In this work, we introduce a filtering procedure that removes from the TTA images all the OOD samples whose representations lie far from the training data distribution. Moreover, all the retained TTA images are weighted inversely to their distance from the training data. The final prediction is provided by a Self-Ensemble with Confidence, which is a lightweight ensemble strategy that fuses predictions from the original and retained TTA samples using a weighted soft voting scheme, without requiring multiple models or retraining. This method is model-agnostic and can be integrated with any deep learning architecture, making it broadly applicable across various domains. Experiments on cross-domain leukocyte classification benchmarks demonstrate that our method consistently improves over standard TTA and Baseline inference, particularly when strong domain shifts are present. Ablation studies and statistical tests confirm the effectiveness and significance of each component.</p>","PeriodicalId":37035,"journal":{"name":"Journal of Imaging","volume":"11 9","pages":""},"PeriodicalIF":2.7000,"publicationDate":"2025-08-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12470409/pdf/","citationCount":"0","resultStr":"{\"title\":\"Test-Time Augmentation for Cross-Domain Leukocyte Classification via OOD Filtering and Self-Ensembling.\",\"authors\":\"Lorenzo Putzu, Andrea Loddo, Cecilia Di Ruberto\",\"doi\":\"10.3390/jimaging11090295\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Domain shift poses a major challenge in many Machine Learning applications due to variations in data acquisition protocols, particularly in the medical field. Test-time augmentation (TTA) can solve the domain shift issue and improve robustness by aggregating predictions from multiple augmented versions of the same input. However, TTA may inadvertently generate unrealistic or Out-of-Distribution (OOD) samples that negatively affect prediction quality. In this work, we introduce a filtering procedure that removes from the TTA images all the OOD samples whose representations lie far from the training data distribution. Moreover, all the retained TTA images are weighted inversely to their distance from the training data. The final prediction is provided by a Self-Ensemble with Confidence, which is a lightweight ensemble strategy that fuses predictions from the original and retained TTA samples using a weighted soft voting scheme, without requiring multiple models or retraining. This method is model-agnostic and can be integrated with any deep learning architecture, making it broadly applicable across various domains. Experiments on cross-domain leukocyte classification benchmarks demonstrate that our method consistently improves over standard TTA and Baseline inference, particularly when strong domain shifts are present. Ablation studies and statistical tests confirm the effectiveness and significance of each component.</p>\",\"PeriodicalId\":37035,\"journal\":{\"name\":\"Journal of Imaging\",\"volume\":\"11 9\",\"pages\":\"\"},\"PeriodicalIF\":2.7000,\"publicationDate\":\"2025-08-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12470409/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Journal of Imaging\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3390/jimaging11090295\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"IMAGING SCIENCE & PHOTOGRAPHIC TECHNOLOGY\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Imaging","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/jimaging11090295","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"IMAGING SCIENCE & PHOTOGRAPHIC TECHNOLOGY","Score":null,"Total":0}
引用次数: 0
摘要
由于数据采集协议的变化,特别是在医疗领域,领域转移对许多机器学习应用构成了重大挑战。测试时间增强(TTA)可以解决域移位问题,并通过聚合来自相同输入的多个增强版本的预测来提高鲁棒性。然而,TTA可能会无意中产生不现实或偏离分布(OOD)的样本,从而对预测质量产生负面影响。在这项工作中,我们引入了一个过滤过程,从TTA图像中去除所有表示远离训练数据分布的OOD样本。此外,所有保留的TTA图像的权重与它们到训练数据的距离成反比。最终的预测是由Self-Ensemble with Confidence提供的,这是一种轻量级的集成策略,它使用加权软投票方案融合了原始和保留的TTA样本的预测,而不需要多个模型或重新训练。这种方法与模型无关,可以与任何深度学习体系结构集成,使其广泛适用于各个领域。跨域白细胞分类基准的实验表明,我们的方法始终优于标准的TTA和基线推断,特别是当存在强烈的域移位时。消融研究和统计试验证实了每个组成部分的有效性和意义。
Test-Time Augmentation for Cross-Domain Leukocyte Classification via OOD Filtering and Self-Ensembling.
Domain shift poses a major challenge in many Machine Learning applications due to variations in data acquisition protocols, particularly in the medical field. Test-time augmentation (TTA) can solve the domain shift issue and improve robustness by aggregating predictions from multiple augmented versions of the same input. However, TTA may inadvertently generate unrealistic or Out-of-Distribution (OOD) samples that negatively affect prediction quality. In this work, we introduce a filtering procedure that removes from the TTA images all the OOD samples whose representations lie far from the training data distribution. Moreover, all the retained TTA images are weighted inversely to their distance from the training data. The final prediction is provided by a Self-Ensemble with Confidence, which is a lightweight ensemble strategy that fuses predictions from the original and retained TTA samples using a weighted soft voting scheme, without requiring multiple models or retraining. This method is model-agnostic and can be integrated with any deep learning architecture, making it broadly applicable across various domains. Experiments on cross-domain leukocyte classification benchmarks demonstrate that our method consistently improves over standard TTA and Baseline inference, particularly when strong domain shifts are present. Ablation studies and statistical tests confirm the effectiveness and significance of each component.