Concordance with CONSORT-AI guidelines in reporting of randomised controlled trials investigating artificial intelligence in oncology: a systematic review.
David Chen, Kristen Arnold, Ronesh Sukhdeo, John Farag Alla, Srinivas Raman
BMJ Oncology 2025;4(1):e000733. doi:10.1136/bmjonc-2025-000733. Published 24 August 2025. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12414185/pdf/
Abstract
Background: The advent of artificial intelligence (AI) tools in oncology to support clinical decision-making, reduce physician workload and automate workflow inefficiencies offers great promise but also warrants caution. To generate high-quality evidence on the safety and efficacy of AI interventions, randomised controlled trials (RCTs) remain the gold standard. However, the completeness and quality of reporting among AI trials in oncology remain unknown.
Objective: This systematic review investigates the reporting concordance of RCTs of AI interventions in oncology against the CONSORT (Consolidated Standards of Reporting Trials) 2010 statement and the CONSORT-AI 2020 extension, and comprehensively summarises the state of AI RCTs in oncology.
Methods and analysis: We queried OVID MEDLINE and Embase on 22 October 2024 using AI, cancer and RCT search terms. Studies were included if they reported on an AI intervention in an RCT enrolling participants with cancer.
Results: This study included 57 RCTs of AI interventions in oncology that were primarily focused on screening (54%) or diagnosis (19%) and intended for clinician use (88%). Among all 57 RCTs, median concordance with CONSORT 2010 and CONSORT-AI 2020 was 82%. Compared with trials published before the release of CONSORT-AI (n=8), trials published after its release (n=49) had lower median overall CONSORT (82% vs 92%) and CONSORT 2010 (81% vs 92%) concordance but similar median CONSORT-AI concordance (93% vs 93%). Guideline items covering methodological details necessary to reproduce the AI intervention, such as input data inclusion and exclusion criteria, algorithm version, handling of low-quality data, assessment of performance errors and data accessibility, were consistently under-reported. When stratifying included trials by their overall risk of bias, trials at serious risk of bias (57%) were less concordant with CONSORT guidelines than trials at moderate (71%) or low (84%) risk of bias.
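The concordance figures above can be read as the share of applicable checklist items a trial reports, summarised by the median across trials. A minimal sketch of that calculation, using hypothetical per-trial item counts (not data from this review):

```python
# Sketch of checklist concordance scoring, assuming concordance is defined as
# the percentage of applicable CONSORT / CONSORT-AI items a trial reports.
from statistics import median

def concordance(reported: int, applicable: int) -> float:
    """Percentage of applicable checklist items that a trial reports."""
    return 100 * reported / applicable

# Hypothetical (reported, applicable) item counts for three trials.
trials = [(30, 37), (34, 37), (28, 35)]
scores = [concordance(r, a) for r, a in trials]
print(round(median(scores), 1))  # prints 81.1
```

Items judged not applicable to a given trial are excluded from the denominator, so trials with different designs remain comparable on the same scale.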
Conclusion: Although the majority of CONSORT and CONSORT-AI items were well reported, critical gaps in the reporting of methodology, reproducibility and harms persist. Addressing these gaps through trial designs that mitigate risk of bias, coupled with standardised reporting, is one step towards responsible adoption of AI to improve patient outcomes in oncology.