Tumour purity assessment with deep learning in colorectal cancer and impact on molecular analysis
IF 5.6
Zone 2, Medicine
Q1 ONCOLOGY
Lydia A Schoenpflug, Aikaterini Chatzipli, Korsuk Sirinukunwattana, Susan Richman, Andrew Blake, James Robineau, Kirsten D Mertz, Clare Verrill, Simon J Leedham, Claire Hardy, Celina Whalley, Keara Redmond, Philip Dunne, Steven Walker, Andrew D Beggs, Ultan McDermott, Graeme I Murray, Leslie M Samuel, Matthew Seymour, Ian Tomlinson, Philip Quirke, S:CORT consortium, Jens Rittscher, Tim Maughan, Enric Domingo, Viktor H Koelzer
Journal: The Journal of Pathology, 265(2), pages 184-197
Publication date: 2024-12-22
Publication type: Journal Article
DOI: 10.1002/path.6376
Publisher page: https://onlinelibrary.wiley.com/doi/10.1002/path.6376
PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11717495/pdf/
Abstract
Tumour content plays a pivotal role in directing the bioinformatic analysis of molecular profiles such as copy number variation (CNV). In clinical application, tumour purity estimation (TPE) is achieved either through visual pathological review [conventional pathology (CP)] or the deconvolution of molecular data. While CP provides a direct measurement, it demonstrates modest reproducibility and lacks standardisation. Conversely, deconvolution methods offer an indirect assessment with uncertain accuracy, underscoring the necessity for innovative approaches. SoftCTM is an open-source, multiorgan deep-learning (DL) model for the detection of tumour and non-tumour cells in H&E-stained slides, developed within the Overlapped Cell on Tissue Dataset for Histopathology (OCELOT) Challenge 2023. Here, using three large multicentre colorectal cancer (CRC) cohorts (N = 1,097 patients) with digital pathology and multi-omic data, we compare the utility and accuracy of TPE with SoftCTM versus CP and bioinformatic deconvolution methods (RNA expression, DNA methylation) for downstream molecular analysis, including CNV profiling. SoftCTM showed technical repeatability when applied twice on the same slide (r = 1.0) and excellent correlations in paired H&E slides (r > 0.9). TPEs profiled by SoftCTM correlated highly with RNA expression (r = 0.59) and DNA methylation (r = 0.40), while TPEs by CP showed a lower correlation with RNA expression (r = 0.41) and DNA methylation (r = 0.29). We show that CP and deconvolution methods respectively underestimate and overestimate tumour content compared to SoftCTM, resulting in 6–13% differing CNV calls. In summary, TPE with SoftCTM enables reproducibility, automation, and standardisation at single-cell resolution. 
SoftCTM estimates (M = 58.9%, SD ±16.3%) reconcile the overestimation by molecular data extrapolation (RNA expression: M = 79.2%, SD ±10.5%; DNA methylation: M = 62.7%, SD ±11.8%) and underestimation by CP (M = 35.9%, SD ±13.1%), providing a more reliable middle ground. A fully integrated computational pathology solution could therefore be used to improve downstream molecular analyses for research and clinics. © 2024 The Author(s). The Journal of Pathology published by John Wiley & Sons Ltd on behalf of The Pathological Society of Great Britain and Ireland.
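To illustrate why the assumed tumour purity changes CNV calls, the following is a minimal sketch, not the authors' pipeline. It assumes that purity can be taken as the fraction of detected cells classified as tumour, and it uses the textbook two-component mixture model with a diploid normal compartment. The function names (`tumour_purity`, `expected_log2_ratio`, `purity_adjusted_cn`) are hypothetical and do not come from SoftCTM or any CNV caller. The only values taken from the abstract are the mean purity estimates.

```python
import math


def tumour_purity(n_tumour: int, n_non_tumour: int) -> float:
    """Fraction of detected cells that are tumour cells (assumed definition)."""
    total = n_tumour + n_non_tumour
    if total == 0:
        raise ValueError("no cells detected")
    return n_tumour / total


def expected_log2_ratio(tumour_cn: float, purity: float,
                        normal_cn: float = 2.0) -> float:
    """Expected log2 copy ratio of a segment in a tumour/normal mixture.

    Assumes a diploid normal compartment and a diploid baseline.
    """
    mixed_cn = purity * tumour_cn + (1.0 - purity) * normal_cn
    return math.log2(mixed_cn / normal_cn)


def purity_adjusted_cn(log2_ratio: float, purity: float,
                       normal_cn: float = 2.0) -> float:
    """Invert the mixture model to estimate the tumour copy number."""
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    mixed_cn = normal_cn * 2.0 ** log2_ratio
    return (mixed_cn - (1.0 - purity) * normal_cn) / purity


# Hypothetical segment: a single-copy gain (CN = 3) observed in a sample
# whose true purity equals the mean SoftCTM estimate from the abstract.
observed = expected_log2_ratio(tumour_cn=3.0, purity=0.589)

# Mean purity estimates reported in the abstract, per method.
mean_purity = {
    "CP": 0.359,
    "SoftCTM": 0.589,
    "DNA methylation": 0.627,
    "RNA expression": 0.792,
}

for method, purity in mean_purity.items():
    cn = purity_adjusted_cn(observed, purity)
    print(f"{method:16s} purity={purity:.3f} estimated tumour CN={cn:.2f}")
```

Under this model, a lower assumed purity inflates the estimated tumour copy number for the same observed signal and a higher assumed purity deflates it. This is one way in which differing purity estimates can lead to differing CNV calls.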