Differential diagnosis between low-risk and high-risk thymoma: Comparison of diagnostic performance of radiologists with and without deep learning model.
Impact factor: 0.9 (JCR Q4, Radiology, Nuclear Medicine & Medical Imaging)
Authors: Yuriko Yoshida, Masahiro Yanagawa, Yukihisa Sato, Tomo Miyata, Atsushi Kawata, Akinori Hata, Noriyuki Tomiyama
Journal: Acta radiologica open
Published: 2024-10-04 (eCollection; Epub 2024-10-01)
DOI: 10.1177/20584601241288509
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11457241/pdf/
Citations: 0
Abstract
Background: There are few CT-based deep learning (DL) studies on thymoma according to the World Health Organization classification.
Purpose: To develop a CT-based DL model to distinguish between low-risk and high-risk thymoma and to compare the diagnostic performance of radiologists with and without the DL model.
Material and methods: A total of 159 patients with 160 thymomas were included. A VGG16 network was fine-tuned with the Adam optimizer, followed by k-fold cross-validation. The dataset consisted of three axial slices from the CT volume data, including the slice showing the maximum tumor size. The data were augmented 50-fold by rotation, zoom, shear, and horizontal/vertical flip. Three independent networks were used for the CT dataset, and the final result was determined by voting. Three radiologists independently diagnosed the thymomas with and without the model. The area under the curve (AUC) of the diagnostic performance was compared using receiver operating characteristic analysis.
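The voting step described in the methods can be sketched as follows. This is a minimal illustration, assuming simple majority voting over hard class labels; the abstract does not specify the voting rule, and the function name and sample labels are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by most networks for a single tumor."""
    label, _count = Counter(predictions).most_common(1)[0]
    return label


# Hypothetical per-network predictions for four tumors (not study data).
net_a = ["low", "high", "high", "low"]
net_b = ["low", "low", "high", "high"]
net_c = ["high", "high", "high", "low"]

# Combine the three networks tumor by tumor.
ensemble = [majority_vote(votes) for votes in zip(net_a, net_b, net_c)]
print(ensemble)
```

With three voters and two classes a tie cannot occur, which is one practical reason to use an odd number of networks.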
Results: Accuracy of the DL model was 71.3%. Diagnostic performance of the radiologists was as follows: AUC and accuracy without the DL model, 0.61-0.68 and 61.9%-69.3%; and with the DL model, 0.66-0.69 and 68.1%-70.0%, respectively. AUC of the diagnostic performance showed no significant differences between radiologists with and without the DL model. The DL model tended to increase the diagnostic accuracy, but AUC was not significantly improved.
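For readers unfamiliar with the two metrics reported above, the sketch below shows one way to compute them. It assumes binary labels (1 = high-risk, 0 = low-risk) and a continuous confidence score per case; the abstract does not describe how scores were collected, and the function names, threshold, and sample values are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random positive case outscores a
    random negative case, counting ties as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases classified correctly at a fixed score threshold."""
    correct = sum((s >= threshold) == bool(y) for y, s in zip(labels, scores))
    return correct / len(labels)


# Hypothetical labels and reader confidence scores (not study data).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]

print(roc_auc(labels, scores), accuracy(labels, scores))
```

Accuracy depends on the chosen threshold while AUC summarizes all thresholds, so an assisted reader can gain in accuracy without a significant change in AUC.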
Conclusion: The diagnostic performance of the DL model was comparable to that of the radiologists. Assistance from the DL model tended to increase diagnostic accuracy.