Machine learning analysis of confounding variables of a convolutional neural network specific for abdominal aortic aneurysms
Roger T. Tomihama MD, Justin R. Camara MD, Sharon C. Kiang MD
JVS-vascular science, volume 4, Article 100096. Published 2023-01-01. DOI: 10.1016/j.jvssci.2022.11.004
Article page: https://www.sciencedirect.com/science/article/pii/S2666350322000840
PDF: https://ftp.ncbi.nlm.nih.gov/pub/pmc/oa_pdf/a6/a7/main.PMC10245322.pdf
Citations: 2
Abstract
Objective
To identify confounding variables influencing the accuracy of a convolutional neural network (CNN) specific for infrarenal abdominal aortic aneurysms (AAAs) on computed tomography angiograms (CTAs).
Methods
This Health Insurance Portability and Accountability Act-compliant, institutional review board-approved retrospective study analyzed abdominopelvic CTA scans from 200 patients with infrarenal AAAs and 200 propensity-matched control patients. An AAA-specific CNN was developed by applying transfer learning to the VGG-16 base model, followed by model training, validation, and testing. Model accuracy and area under the curve were analyzed by data set type (selected, balanced, or unbalanced), aneurysm size, extra-abdominal extension, dissection, and mural thrombus. Misjudgments were analyzed by reviewing heatmaps, generated via gradient-weighted class activation mapping, overlaid on CTA images.
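The heatmap step can be illustrated with a minimal sketch of the core combination used in gradient-weighted class activation mapping: each feature-map channel is weighted by the average of its gradients, the weighted channels are summed, and negative values are clipped. This is not the authors' implementation; the function names and the 2 x 2 toy inputs are assumptions made purely for illustration, and a real pipeline would take activations and gradients from the last convolutional layer of the trained network.

```python
# Minimal pure-Python sketch of the Grad-CAM combination step.
# All inputs below are made-up toy values, not data from the study.


def grad_cam(feature_maps, gradients):
    """Combine K feature maps (each H x W) into one H x W heatmap.

    feature_maps[k][i][j]: activation of channel k at position (i, j)
    gradients[k][i][j]:    d(class score) / d(activation) at that position
    """
    height = len(feature_maps[0])
    width = len(feature_maps[0][0])

    # Channel weight = global average of that channel's gradients.
    weights = [
        sum(sum(row) for row in grad) / (height * width)
        for grad in gradients
    ]

    # Weighted sum over channels, then ReLU to keep positive evidence only.
    heatmap = [[0.0] * width for _ in range(height)]
    for weight, fmap in zip(weights, feature_maps):
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += weight * fmap[i][j]
    return [[max(0.0, value) for value in row] for row in heatmap]


def normalize(heatmap):
    """Scale to [0, 1] so the map can be overlaid on an image."""
    peak = max(max(row) for row in heatmap)
    if peak == 0:
        return heatmap
    return [[value / peak for value in row] for row in heatmap]


# Hypothetical two-channel example.
toy_maps = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 3.0], [1.0, 0.0]],
]
toy_grads = [
    [[0.5, 0.5], [0.5, 0.5]],      # average gradient 0.5
    [[-1.0, -1.0], [-1.0, -1.0]],  # average gradient -1.0
]
cam = normalize(grad_cam(toy_maps, toy_grads))
print(cam)
```

In this toy case the second channel has a negative average gradient, so the positions it activates are suppressed to zero and the heatmap highlights only the positions supported by the first channel.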
Results
The trained custom CNN model achieved high test group accuracies of 94.1%, 99.1%, and 99.6% and areas under the curve of 0.9900, 0.9998, and 0.9993 in the selected (n = 120), balanced (n = 3704), and unbalanced (n = 31,899) image sets, respectively. Despite an eightfold difference in size between the balanced and unbalanced image sets, the CNN model demonstrated high test group sensitivities (98.7% vs 98.9%) and specificities (99.7% vs 99.3%) in the unbalanced and balanced image sets, respectively. Misjudgments decreased as aneurysm size increased: 47% (16/34) for aneurysms <3.3 cm, 32% (11/34) for aneurysms 3.3 to 5 cm, and 20% (7/34) for aneurysms >5 cm. Aneurysms containing measurable mural thrombus were over-represented among type II (false-negative) misjudgments compared with type I (false-positive) misjudgments (71% vs 15%, P < .05). Inclusion of extra-abdominal aneurysm extension (thoracic or iliac artery) or dissection flaps in these image sets did not decrease the model's overall accuracy, indicating that the model performed well without the need to clean the data set of confounding or comorbid diagnoses.
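The arithmetic behind these figures can be sketched as follows. The misjudgment counts (16, 11, and 7 of 34) and the image set sizes are taken from the abstract; the confusion-matrix counts passed to `rates` are hypothetical, since the abstract does not report the underlying true/false positive and negative counts, and the function and variable names are illustrative only.

```python
# Sketch of the summary statistics quoted above.


def rates(tp, fp, tn, fn):
    """Return (sensitivity, specificity, accuracy) from confusion counts."""
    sensitivity = tp / (tp + fn)  # share of true aneurysms that were detected
    specificity = tn / (tn + fp)  # share of controls correctly cleared
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return sensitivity, specificity, accuracy


# Misjudgment counts by aneurysm size, as reported in the abstract.
misjudged = {"<3.3 cm": 16, "3.3 to 5 cm": 11, ">5 cm": 7}
total = sum(misjudged.values())
shares = {size: 100 * count / total for size, count in misjudged.items()}

# Size ratio of the unbalanced to the balanced image set (from the abstract).
size_ratio = 31899 / 3704

# Hypothetical confusion counts, for illustration only.
sens, spec, acc = rates(tp=90, fp=5, tn=95, fn=10)

for size, share in shares.items():
    print(f"{size}: {share:.1f}% of misjudgments")
print(f"unbalanced / balanced image set size: {size_ratio:.1f}x")
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

Computed to one decimal place, the three shares are 47.1%, 32.4%, and 20.6%, which the abstract reports as 47%, 32%, and 20%; the image set ratio comes out at about 8.6, consistent with the "eightfold" description.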
Conclusions
An AAA-specific CNN model can accurately screen for and identify infrarenal AAAs on CTA despite varying pathology and data set size. Misjudgments were most frequent with small aneurysms (<3.3 cm) and in the presence of mural thrombus. Accuracy of the CNN model was maintained despite the inclusion of extra-abdominal pathology and imbalanced data sets.