Unsupervised learning for labeling global glomerulosclerosis

bioRxiv - Pathology. Pub Date: 2024-09-03. DOI: 10.1101/2024.09.01.610244

Hrafn Weishaupt, Justinas Besusparis, Cleo-Aron Weis, Stefan Porubsky, Arvydas Laurinavicius, Sabine Leh



Abstract

Current deep learning models for classifying glomeruli in nephropathology are trained almost exclusively in a supervised manner, requiring expert-labeled images. Very little is known about the potential for unsupervised learning to overcome this bottleneck. To address this open question in a proof-of-concept, the project focused on the most fundamental classification task: globally sclerosed versus non-globally sclerosed glomeruli. The performance of clustering between the two classes was extensively studied across a variety of labeled datasets with diverse compositions and histological stains, and across the feature embeddings produced by 34 different pre-trained CNN models. As demonstrated by the study, clustering of globally and non-globally sclerosed glomeruli is generally highly feasible, yielding accuracies of over 95% in most datasets. Further work will be required to expand these experiments towards the clustering of additional glomerular lesion categories. We are convinced that these efforts (i) will open up opportunities for semi-automatic labeling approaches, thus alleviating the need for labor-intensive manual labeling, and (ii) illustrate that glomerular classification models can potentially be trained even in the absence of expert-derived class labels.
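The evaluation idea described in the abstract (cluster feature embeddings into two groups, then score the clusters against held-back labels) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not name the clustering algorithm, so a plain two-cluster k-means stands in for it; synthetic Gaussian vectors stand in for the CNN feature embeddings; and the names `two_means`, `clustering_accuracy`, and `make_toy_embeddings` are hypothetical. The labels are used only for scoring, never for clustering.

```python
import random


def two_means(points, iters=50):
    """Lloyd's algorithm with k=2; returns a cluster id (0 or 1) per point."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Deterministic initialisation: the first point and the point farthest from it.
    c0 = points[0]
    c1 = max(points, key=lambda p: dist2(p, c0))
    centers = [c0, c1]
    assign = []
    for _ in range(iters):
        new_assign = [
            0 if dist2(p, centers[0]) <= dist2(p, centers[1]) else 1
            for p in points
        ]
        if new_assign == assign:  # assignments stable, so the algorithm has converged
            break
        assign = new_assign
        for k in (0, 1):
            members = [p for p, a in zip(points, assign) if a == k]
            if members:  # recompute the centre as the mean of its members
                centers[k] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign


def clustering_accuracy(assign, labels):
    """Accuracy under the better of the two possible cluster-to-class mappings.

    Cluster ids are arbitrary, so cluster 0 may correspond to either class.
    Both assign and labels must contain only 0 and 1.
    """
    agree = sum(a == y for a, y in zip(assign, labels))
    return max(agree, len(labels) - agree) / len(labels)


def make_toy_embeddings(n_per_class=100, dim=8, seed=0):
    """Synthetic stand-in for CNN embeddings: two well-separated Gaussian blobs."""
    rng = random.Random(seed)
    points, labels = [], []
    for label, mean in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            points.append(tuple(rng.gauss(mean, 1.0) for _ in range(dim)))
            labels.append(label)
    return points, labels


points, labels = make_toy_embeddings()
assign = two_means(points)          # unsupervised step: labels are not used
acc = clustering_accuracy(assign, labels)  # labels are used only for scoring
print(f"clustering accuracy: {acc:.3f}")
```

Because cluster ids carry no meaning of their own, the accuracy is taken over the better of the two mappings between clusters and classes. With real data, the toy vectors would be replaced by embeddings extracted from a pre-trained CNN, and the separation between classes would depend on the model and stain rather than being guaranteed as it is here.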

