Study Protocol: Development and Retrospective Validation of an Artificial Intelligence System for Diagnostic Assessment of Prostate Biopsies

Nita Mulliqi, Anders Blilie, Xiaoyi Ji, Kelvin Szolnoky, Henrik Olsson, Matteo Titus, Geraldine Martinez Gonzalez, Sol Erika Boman, Masi Valkonen, Einar Gudlaugsson, Svein R Kjosavik, Jose Asenjo, Marcello Gambacorta, Paolo Libretti, Marcin Braun, Radzislaw Kordek, Roman Lowicki, Kristina Hotakainen, Paivi Vare, Bodil Ginnerup Pedersen, Karina Dalsgaard Sorensen, Benedicte Parm Ulhoi, Mattias Rantalainen, Pekka Ruusuvuori, Brett Delahunt, Hemamali Samaratunga, Toyonori Tsuzuki, Emilius A.M. Janssen, Lars Egevad, Kimmo Kartasalo, Martin Eklund
medRxiv - Pathology. Published 2024-07-07. DOI: 10.1101/2024.07.04.24309948 (https://doi.org/10.1101/2024.07.04.24309948)

Abstract

Histopathological evaluation of prostate biopsies using the Gleason scoring system is critical for prostate cancer diagnosis and treatment selection. However, grading variability among pathologists can lead to inconsistent assessments, risking inappropriate treatment. Similar challenges complicate the assessment of other prognostic features such as cribriform cancer morphology and perineural invasion. Many pathology departments are also facing an increasingly unsustainable workload due to rising prostate cancer incidence and a decreasing pathologist workforce, coinciding with increasing requirements for more complex assessments and reporting. Digital pathology and artificial intelligence (AI) algorithms for analysing whole slide images (WSI) show promise in improving the accuracy and efficiency of histopathological assessments. Studies have demonstrated AI's capability to diagnose and grade prostate cancer comparably to expert pathologists. However, external validations on diverse data sets have been limited and often show reduced performance. Historically, there have been no well-established guidelines for AI study designs and validation methods. Diagnostic assessments of AI systems often lack pre-registered protocols and rigorous external cohort sampling, both of which are essential for reliable evidence of their safety and accuracy. This study protocol covers the retrospective validation of an AI system for prostate biopsy assessment. The primary objective of the study is to develop a high-performing and robust AI model for diagnosis and Gleason scoring of prostate cancer in core needle biopsies, and to evaluate at scale whether it can generalise to fully external data from independent patients, pathology laboratories, and digitalisation platforms. The secondary objectives cover AI performance in estimating cancer extent and in detecting cribriform prostate cancer and perineural invasion.
This protocol outlines the steps for data collection, predefined partitioning of data cohorts for AI model training and validation, model development, and predetermined statistical analyses, ensuring systematic development and comprehensive validation of the system. The protocol adheres to TRIPOD+AI, PIECES, CLAIM, and other relevant best practices.
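As an illustration of what a predefined, patient-level partition can look like in practice, the following is a minimal sketch only and not the procedure specified in the protocol. It assumes each slide carries a patient identifier and assigns whole patients to a partition by hashing that identifier, so that no patient contributes slides to both training and validation. The function names, the `salt` value, and the 20% validation fraction are hypothetical choices made for the example. A split of this kind covers internal partitioning only; it cannot stand in for the fully external cohorts from independent laboratories and digitalisation platforms described above.

```python
import hashlib
from collections import defaultdict


def assign_partition(patient_id: str,
                     validation_fraction: float = 0.2,
                     salt: str = "protocol-v1") -> str:
    """Deterministically map a patient to 'training' or 'validation'.

    Hypothetical helper: the salt and fraction are illustrative assumptions,
    not values taken from the study protocol.
    """
    digest = hashlib.sha256(f"{salt}:{patient_id}".encode("utf-8")).hexdigest()
    # Map the first 32 bits of the hash to a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "validation" if bucket < validation_fraction else "training"


def partition_slides(slides, validation_fraction: float = 0.2):
    """Group (slide_id, patient_id) pairs by the partition of their patient.

    All slides from one patient land in the same partition, which avoids
    patient-level leakage between training and validation data.
    """
    partitions = defaultdict(list)
    for slide_id, patient_id in slides:
        name = assign_partition(patient_id, validation_fraction)
        partitions[name].append(slide_id)
    return dict(partitions)


if __name__ == "__main__":
    example = [("slide-1", "patient-A"), ("slide-2", "patient-A"),
               ("slide-3", "patient-B"), ("slide-4", "patient-C")]
    for name, ids in partition_slides(example).items():
        print(name, ids)
```

Because the assignment depends only on the patient identifier and the salt, it can be fixed before any model is trained and reproduced later without storing a random seed or a lookup table.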