Addressing Limited Generalizability in Artificial Intelligence-Based Brain Aneurysm Detection for Computed Tomography Angiography: Development of an Externally Validated Artificial Intelligence Screening Platform.

IF 3.9, Zone 2 (Medicine), JCR Q1 (Clinical Neurology)

Neurosurgery, Pub Date: 2025-06-09, DOI: 10.1227/neu.0000000000003549

Samuel D Pettersson, Jean Filo, Peter Liaw, Paulina Skrzypkowska, Tomasz Klepinowski, Tomasz Szmuda, Thomas B Fodor, Felipe Ramirez-Velandia, Piotr Zieliński, Yu-Ming Chang, Philipp Taussky, Christopher S Ogilvy


Citations: 0

Abstract

Background and objectives: Brain aneurysm detection models, both in the literature and in industry, continue to lack generalizability during external validation, limiting clinical adoption. This challenge is largely due to extensive exclusion criteria during training data selection. The authors developed the first model to achieve generalizability using novel methodological approaches.

Methods: Computed tomography angiography (CTA) scans from 2004 to 2023 at the study institution were used for model training, including untreated unruptured intracranial aneurysms without extensive cerebrovascular disease. External validation used digital subtraction angiography-verified CTAs from an international center, while prospective validation occurred at the internal institution over 9 months. A public web platform was created for further model validation.

Results: A total of 2194 CTA scans were used for this study. One thousand five hundred eighty-seven patients and 1920 aneurysms with a mean size of 5.3 ± 3.7 mm were included in the training cohort. The mean age of the patients was 69.7 ± 14.9 years, and 1203 (75.8%) were female. The model achieved a training Dice score of 0.88 and a validation Dice score of 0.76. Prospective internal validation on 304 scans yielded a lesion-level (LL) sensitivity of 82.5% (95% CI: 75.5-87.9) and specificity of 89.6% (95% CI: 84.5-93.2). External validation on 303 scans demonstrated an on-par LL sensitivity and specificity of 83.5% (95% CI: 75.1-89.4) and 92.9% (95% CI: 88.8-95.6), respectively. Radiologist LL sensitivity from the external center was 84.5% (95% CI: 76.2-90.2), and 87.5% of the aneurysms missed by the radiologists were detected by the model.
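For readers less familiar with the metrics reported above, the following is a minimal Python sketch of how a Dice score and a sensitivity or specificity with a 95% confidence interval are commonly computed. It is illustrative only: the abstract does not state which interval method the authors used, so the Wilson score interval here is an assumption, and the function names and all counts are hypothetical rather than taken from the study.

```python
import math


def dice_score(pred, truth):
    """Dice = 2 * |P and T| / (|P| + |T|) for two equal-length binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    overlap = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Convention (an assumption): two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * overlap / total


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (z = 1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def sensitivity(tp, fn):
    """Fraction of true lesions that were detected: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of negatives correctly called negative: TN / (TN + FP)."""
    return tn / (tn + fp)


if __name__ == "__main__":
    # Hypothetical counts, NOT the study's data.
    tp, fn = 80, 20
    low, high = wilson_interval(tp, tp + fn)
    print(f"sensitivity = {sensitivity(tp, fn):.3f} (95% CI: {low:.3f}-{high:.3f})")
    # Toy flattened masks: one of two predicted voxels overlaps the truth.
    print(f"Dice = {dice_score([1, 1, 0, 0], [1, 0, 0, 0]):.3f}")
```

At the lesion level, each aneurysm (rather than each scan or patient) is the unit counted as a true positive or false negative, so `tp + fn` in this sketch would be the number of reference-standard aneurysms.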

Conclusion: The authors developed the first publicly testable artificial intelligence model for aneurysm detection on CTA scans, demonstrating generalizability and state-of-the-art performance in external validation. The model addresses key limitations of previous efforts and enables broader validation through a web-based platform.

Source journal

Neurosurgery (Medicine, Clinical Neurology)

CiteScore: 8.20
Self-citation rate: 6.20%
Articles published: 898
Review time: 2-4 weeks

About the journal: Neurosurgery, the official journal of the Congress of Neurological Surgeons, publishes research on clinical and experimental neurosurgery covering the very latest developments in science, technology, and medicine. For professionals aware of the rapid pace of developments in the field, this journal is nothing short of indispensable as the most complete window on the contemporary field of neurosurgery. Neurosurgery is the fastest-growing journal in the field, with a worldwide reputation for reliable coverage delivered with a fresh and dynamic outlook.