Re-identification from histopathology images

IF 10.7 1区 医学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Jonathan Ganz , Jonas Ammeling , Samir Jabari , Katharina Breininger , Marc Aubreville
{"title":"Re-identification from histopathology images","authors":"Jonathan Ganz ,&nbsp;Jonas Ammeling ,&nbsp;Samir Jabari ,&nbsp;Katharina Breininger ,&nbsp;Marc Aubreville","doi":"10.1016/j.media.2024.103335","DOIUrl":null,"url":null,"abstract":"<div><div>In numerous studies, deep learning algorithms have proven their potential for the analysis of histopathology images, for example, for revealing the subtypes of tumors or the primary origin of metastases. These models require large datasets for training, which must be anonymized to prevent possible patient identity leaks. This study demonstrates that even relatively simple deep learning algorithms can re-identify patients in large histopathology datasets with substantial accuracy. In addition, we compared a comprehensive set of state-of-the-art whole slide image classifiers and feature extractors for the given task. We evaluated our algorithms on two TCIA datasets including lung squamous cell carcinoma (LSCC) and lung adenocarcinoma (LUAD). We also demonstrate the algorithm’s performance on an in-house dataset of meningioma tissue. We predicted the source patient of a slide with <span><math><msub><mrow><mi>F</mi></mrow><mrow><mn>1</mn></mrow></msub></math></span> scores of up to 80.1% and 77.19% on the LSCC and LUAD datasets, respectively, and with 77.09% on our meningioma dataset. Based on our findings, we formulated a risk assessment scheme to estimate the risk to the patient’s privacy prior to publication.</div></div>","PeriodicalId":18328,"journal":{"name":"Medical image analysis","volume":"99 ","pages":"Article 103335"},"PeriodicalIF":10.7000,"publicationDate":"2024-09-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S1361841524002603/pdfft?md5=6efea46ba696d683bc55409496e68f7b&pid=1-s2.0-S1361841524002603-main.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Medical image analysis","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1361841524002603","RegionNum":1,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0

Abstract

In numerous studies, deep learning algorithms have proven their potential for the analysis of histopathology images, for example, for revealing the subtypes of tumors or the primary origin of metastases. These models require large datasets for training, which must be anonymized to prevent possible patient identity leaks. This study demonstrates that even relatively simple deep learning algorithms can re-identify patients in large histopathology datasets with substantial accuracy. In addition, we compared a comprehensive set of state-of-the-art whole slide image classifiers and feature extractors for the given task. We evaluated our algorithms on two TCIA datasets including lung squamous cell carcinoma (LSCC) and lung adenocarcinoma (LUAD). We also demonstrate the algorithm’s performance on an in-house dataset of meningioma tissue. We predicted the source patient of a slide with F1 scores of up to 80.1% and 77.19% on the LSCC and LUAD datasets, respectively, and with 77.09% on our meningioma dataset. Based on our findings, we formulated a risk assessment scheme to estimate the risk to the patient’s privacy prior to publication.

Abstract Image

组织病理学图像的再识别
在大量研究中,深度学习算法证明了其在组织病理学图像分析方面的潜力,例如,用于揭示肿瘤的亚型或转移瘤的原发来源。这些模型需要大量数据集进行训练,这些数据集必须进行匿名处理,以防止可能出现的患者身份泄露。本研究证明,即使是相对简单的深度学习算法,也能在大型组织病理学数据集中重新识别患者,而且准确率很高。此外,我们还针对给定任务比较了一整套最先进的全切片图像分类器和特征提取器。我们在两个 TCIA 数据集上评估了我们的算法,包括肺鳞状细胞癌(LSCC)和肺腺癌(LUAD)。我们还展示了算法在脑膜瘤组织内部数据集上的性能。在 LSCC 和 LUAD 数据集上,我们预测玻片来源患者的 F1 分数分别高达 80.1% 和 77.19%,而在脑膜瘤数据集上则高达 77.09%。根据我们的研究结果,我们制定了一个风险评估方案,以估算发表前患者隐私所面临的风险。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Medical image analysis
Medical image analysis 工程技术-工程:生物医学
CiteScore
22.10
自引率
6.40%
发文量
309
审稿时长
6.6 months
期刊介绍: Medical Image Analysis serves as a platform for sharing new research findings in the realm of medical and biological image analysis, with a focus on applications of computer vision, virtual reality, and robotics to biomedical imaging challenges. The journal prioritizes the publication of high-quality, original papers contributing to the fundamental science of processing, analyzing, and utilizing medical and biological images. It welcomes approaches utilizing biomedical image datasets across all spatial scales, from molecular/cellular imaging to tissue/organ imaging.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信