AI-driven glomerular morphology quantification: a novel pipeline for assessing basement membrane thickness and podocyte foot process effacement in kidney diseases
Authors: Michifumi Yamashita, Natalia Piaseczna, Akira Takahashi, Daisuke Kiyozawa, Narihito Tatsumoto, Shohei Kaneko, Natalia Zurek, Arkadiusz Gertych
Journal: Computer Methods and Programs in Biomedicine, volume 268, Article 108842 (published 2025-05-07)
DOI: 10.1016/j.cmpb.2025.108842
URL: https://www.sciencedirect.com/science/article/pii/S0169260725002597
Background and Objective
Measuring the thickness of the glomerular basement membrane (GBM) and assessing the percentage of podocyte foot process effacement (%PFPE) are important for diagnosing non-neoplastic kidney diseases. However, when nephropathologists perform these assessments manually on electron microscopy (EM) images, the lack of universally standardized guidelines makes the measurements technically challenging. We have developed a novel deep learning (DL)-based pipeline that has the potential to reduce human error and improve the consistency and efficiency of GBM thickness and %PFPE quantification.
Methods
This study used 196 EM images from kidney biopsies (representing 21 different kidney diseases from 83 subjects), which were manually annotated by consensus of 3 nephrologists and 2 nephropathologists to provide ground truth (GT) masks of GBMs, podocytes, red blood cells and other glomerular ultrastructures. Of these, 165 images were used to develop two DL models (DeepLabV3+ and U-Net architectures) for EM image segmentation. The models were then evaluated on the remaining 31 images and compared for segmentation accuracy. The predicted GBM and podocyte masks were analyzed by algorithms in the pipeline that automatically measured the corrected harmonic mean of GBM thickness (cmGBM) and estimated the %PFPE. The automated measurements were statistically compared to the corresponding cmGBM and %PFPE values obtained from the consensus GBM and podocyte GT masks. The goal was to identify differences between the measurements provided by the three mask sources (consensus GT, DeepLabV3+ and U-Net). Statistical evaluations were carried out using the intraclass correlation coefficient (ICC) and Bland-Altman plots, which estimate the bias and limits of agreement (LoAs) between the GT and DL mask-based measurements.
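The two quantities named in the Methods, the corrected harmonic mean of thickness and the Bland-Altman bias with limits of agreement, can be illustrated with a minimal sketch. The abstract does not give the formulas used in the pipeline, so the following are assumptions: the correction factor 8/(3π) is the one commonly applied in harmonic-mean GBM measurement, differences are taken as DL minus GT (so a positive bias means the DL-based value is larger), and the LoAs are bias ± 1.96 standard deviations. The function names are hypothetical and are not taken from the authors' code.

```python
import math
import statistics


def corrected_harmonic_mean(thicknesses_nm):
    """Harmonic mean of local thickness samples scaled by 8 / (3 * pi).

    Assumption: the 8/(3*pi) factor is the commonly used correction for
    oblique sectioning; the abstract does not state the exact formula.
    """
    if not thicknesses_nm or any(t <= 0 for t in thicknesses_nm):
        raise ValueError("need at least one sample, all strictly positive")
    return (8.0 / (3.0 * math.pi)) * statistics.harmonic_mean(thicknesses_nm)


def bland_altman(reference, predicted):
    """Return (bias, lower LoA, upper LoA) for paired measurements.

    Assumption: difference = predicted - reference, and the limits of
    agreement are bias +/- 1.96 sample standard deviations.
    """
    if len(reference) != len(predicted) or len(reference) < 2:
        raise ValueError("need two equally long lists with >= 2 pairs")
    diffs = [p - r for r, p in zip(reference, predicted)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


if __name__ == "__main__":
    # Made-up numbers for illustration only, not data from the study.
    gt_values = [310.0, 295.0, 402.0, 350.0]
    dl_values = [315.0, 300.0, 398.0, 356.0]
    print(corrected_harmonic_mean([320.0, 280.0, 300.0]))
    print(bland_altman(gt_values, dl_values))
```

A narrower interval between the lower and upper LoA indicates tighter agreement with the GT-based measurement, which is the sense in which LoA ranges are compared between models.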
Results
In the 31 test set images, the DeepLabV3+ model achieved a global accuracy (gACC) of 92.8% and a weighted intersection over union (wIoU) of 0.869, outperforming the U-Net model, which recorded a gACC of 88.9% and a wIoU of 0.800. For GBM thickness measurements, the cmGBM derived from DeepLabV3+ masks exhibited excellent agreement with GT mask-based measurements (ICC = 0.991, p < 0.001), whereas the U-Net model showed good agreement (ICC = 0.881, p < 0.001). The %PFPE estimates obtained using the DL-generated podocyte masks were highly consistent with those based on GT, with ICC values of 0.926 and 0.928 for DeepLabV3+ and U-Net, respectively. The Bland-Altman plots revealed a positive bias in the cmGBM and %PFPE obtained from the masks generated by the DeepLabV3+ model, and a negative bias in the cmGBM and %PFPE obtained from the masks generated by the U-Net model. However, the DeepLabV3+ masks provided narrower LoA ranges than the U-Net masks for measuring cmGBM.
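For readers unfamiliar with the two segmentation metrics reported above, here is a minimal sketch of how gACC and wIoU can be computed from a pixel-level confusion matrix. It assumes the usual definitions: gACC is the fraction of correctly classified pixels over all classes, and wIoU is the per-class IoU averaged with weights equal to each class's share of ground-truth pixels. The abstract does not spell out the definitions the authors used, and the function name is hypothetical.

```python
def segmentation_metrics(confusion):
    """Return (global accuracy, weighted IoU) from a confusion matrix.

    confusion[i][j] is the number of pixels whose ground-truth class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no pixels")

    # Global accuracy: correctly labelled pixels over all pixels.
    g_acc = sum(confusion[c][c] for c in range(n)) / total

    w_iou = 0.0
    for c in range(n):
        tp = confusion[c][c]
        gt_pixels = sum(confusion[c])                          # row sum
        pred_pixels = sum(confusion[r][c] for r in range(n))   # column sum
        union = gt_pixels + pred_pixels - tp
        if union > 0:
            # Weight each class IoU by its share of ground-truth pixels.
            w_iou += (gt_pixels / total) * (tp / union)
    return g_acc, w_iou


if __name__ == "__main__":
    # Made-up two-class example, not data from the study.
    print(segmentation_metrics([[8, 2], [1, 9]]))
```

Because the weights follow class frequency, wIoU is dominated by large structures; small classes contribute little to it even when they are segmented poorly.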
Conclusions
This study highlights the potential of AI to address the limitations of manual assessments of glomerular ultrastructures in EM images by providing comprehensive, objective and accurate measurements of GBM thickness and %PFPE estimates. Our pipeline with DeepLabV3+ demonstrated robust EM image segmentation performance and excellent reliability of measurements when compared to expert ground truth. Further refinement of this AI-driven method is warranted to advance the diagnostic capabilities and standardization of AI in nephropathology.
About the journal:
To encourage the development of formal computing methods, and their application in biomedical research and medical practice, by illustration of fundamental principles in biomedical informatics research; to stimulate basic research into application software design; to report the state of research of biomedical information processing projects; to report new computer methodologies applied in biomedical areas; to support the eventual distribution of demonstrable software to avoid duplication of effort; to provide a forum for discussion and improvement of existing software; to optimize contact between national organizations and regional user groups by promoting an international exchange of information on formal methods, standards and software in biomedicine.
Computer Methods and Programs in Biomedicine covers computing methodology and software systems derived from computing science for implementation in all aspects of biomedical research and medical practice. It is designed to serve: biochemists; biologists; geneticists; immunologists; neuroscientists; pharmacologists; toxicologists; clinicians; epidemiologists; psychiatrists; psychologists; cardiologists; chemists; (radio)physicists; computer scientists; programmers and systems analysts; biomedical, clinical, electrical and other engineers; teachers of medical informatics and users of educational software.