A retrospective audit of an artificial intelligence software for the detection of intracranial haemorrhage used by a teleradiology company in the United Kingdom.
Garry Pettet, Julie West, Dennis Robert, Aneesh Khetani, Shamie Kumar, Satish Golla, Robert Lavis
BJR Open, vol. 6, no. 1, tzae033. Published 2024-10-04. DOI: 10.1093/bjro/tzae033 (https://doi.org/10.1093/bjro/tzae033). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11522876/pdf/
Abstract
Objectives: Artificial intelligence (AI) algorithms have the potential to assist radiologists in the reporting of head computed tomography (CT) scans. We investigated the performance of an AI-based software device used in a large teleradiology practice for intracranial haemorrhage (ICH) detection.
Methods: A randomly selected subset of all non-contrast CT head (NCCTH) scans from patients aged ≥18 years referred for urgent teleradiology reporting from 44 different hospitals within the United Kingdom over a 4-month period was considered for this evaluation. Thirty auditing radiologists evaluated the NCCTH scans and the AI output retrospectively. Agreement between AI and auditing radiologists is reported along with failure analysis.
Results: A total of 1315 NCCTH scans from as many distinct patients (median age, 73 years [IQR 53-84]; 696 [52.9%] females) were evaluated. One hundred twelve (8.5%) scans had ICH. In detecting ICH, the overall agreement, positive percent agreement, negative percent agreement, and Gwet's AC1 of AI with radiologists were 93.5% (95% CI, 92.1-94.8), 85.7% (77.8-91.6), 94.3% (92.8-95.5), and 0.92 (0.90-0.94), respectively. Nine of the 16 false-negative outcomes were missed subarachnoid haemorrhages, most of which were subtle. The most common cause of false-positive results was motion artefact.
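The agreement statistics above can be checked with a short calculation. The sketch below is illustrative only and is not the authors' analysis code. It takes the radiologists' reads as the reference and assumes a 2x2 table of 96 agreed positives (112 ICH scans minus the 16 false negatives stated above), 16 false negatives, 69 false positives, and 1134 agreed negatives. The 69/1134 split is not given in the abstract; it is back-calculated here as the integer split of the 1203 ICH-negative scans that is consistent with both the reported 94.3% and 93.5%. The class and function names are hypothetical, and confidence intervals are left out because the abstract does not say how they were computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgreementTable:
    """2x2 table of AI output against the auditing radiologists' read."""
    both_positive: int   # AI flags ICH, radiologist reports ICH
    ai_negative_only: int  # AI misses ICH that the radiologist reports
    ai_positive_only: int  # AI flags ICH, radiologist reports none
    both_negative: int   # neither reports ICH

    @property
    def total(self) -> int:
        return (self.both_positive + self.ai_negative_only
                + self.ai_positive_only + self.both_negative)


def agreement_metrics(t: AgreementTable) -> dict[str, float]:
    """Return overall, positive and negative percent agreement and Gwet's AC1."""
    n = t.total
    observed = (t.both_positive + t.both_negative) / n
    ppa = t.both_positive / (t.both_positive + t.ai_negative_only)
    npa = t.both_negative / (t.both_negative + t.ai_positive_only)

    # Gwet's AC1 for two raters and two categories:
    # chance agreement is 2 * pi * (1 - pi), where pi is the mean of the
    # two raters' proportions of positive calls.
    ai_positive_rate = (t.both_positive + t.ai_positive_only) / n
    radiologist_positive_rate = (t.both_positive + t.ai_negative_only) / n
    pi = (ai_positive_rate + radiologist_positive_rate) / 2
    chance = 2 * pi * (1 - pi)
    ac1 = (observed - chance) / (1 - chance)

    return {"overall": observed, "ppa": ppa, "npa": npa, "ac1": ac1}


# Assumed counts: 96 and 16 follow from the abstract; 69 and 1134 are inferred.
table = AgreementTable(both_positive=96, ai_negative_only=16,
                       ai_positive_only=69, both_negative=1134)
metrics = agreement_metrics(table)

for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these assumed counts the point estimates round to 93.5%, 85.7%, 94.3%, and 0.92, matching the reported values. AC1 lands close to the overall agreement because the chance-agreement term stays small when ICH prevalence is low (8.5% here).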
Conclusions: AI demonstrated very good agreement with the radiologists in the detection of ICH.
Advances in knowledge: Real-world evaluation of an AI-based CT head interpretation device is reported. Knowledge of scenarios where false negative and false positive results are possible will help reporting radiologists.