Development and External Validation of an Artificial Intelligence-Based Method for Scalable Chest Radiograph Diagnosis: A Multi-Country Cross-Sectional Study.
Zeye Liu, Jing Xu, Chengliang Yin, Guojing Han, Yue Che, Ge Fan, Xiaofei Li, Lixin Xie, Lei Bao, Zimin Peng, Jinduo Wang, Yan Chen, Fengwen Zhang, Wenbin Ouyang, Shouzheng Wang, Junwei Guo, Yanqiu Ma, Xiangzhi Meng, Taibing Fan, Aihua Zhi, Dawaciren, Kang Yi, Tao You, Yuejin Yang, Jue Liu, Yi Shi, Yuan Huang, Xiangbin Pan
Research, volume 7, article 0426. Published 2024-08-06 (eCollection). DOI: https://doi.org/10.34133/research.0426. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11301699/pdf/
Citation count: 0
Abstract
Problem: Chest radiography is a crucial tool for diagnosing thoracic disorders, but interpretation errors and a shortage of qualified practitioners can delay treatment.

Aim: This study aimed to develop a reliable multi-classification artificial intelligence (AI) tool to improve the accuracy and efficiency of chest radiograph diagnosis.

Methods: We developed a convolutional neural network (CNN) capable of distinguishing among 26 thoracic diagnoses. The model was trained and externally validated using 795,055 chest radiographs from 13 datasets across 4 countries.

Results: The CNN model achieved an average area under the curve (AUC) of 0.961 across all 26 diagnoses in the testing set. COVID-19 detection achieved perfect accuracy (AUC 1.000; 95% confidence interval [CI], 1.000 to 1.000), while effusion or pleural effusion detection showed the lowest accuracy (AUC 0.8453; 95% CI, 0.8417 to 0.8489). In external validation, the model demonstrated strong reproducibility and generalizability within the local dataset, achieving an AUC of 0.9634 for lung opacity detection (95% CI, 0.9423 to 0.9702). The CNN outperformed both radiologists and nonradiological physicians, particularly in trans-device image recognition. Even for diseases it was not specifically trained on, such as aortic dissection, the AI model showed considerable scalability and enhanced the diagnostic accuracy of physicians with varying levels of experience (all P < 0.05). Additionally, the model exhibited no gender bias (P > 0.05).

Conclusion: The developed AI algorithm, now available as professional web-based software, substantively improves chest radiograph interpretation. This research advances medical imaging and offers substantial diagnostic support in clinical settings.
About the journal:
Research serves as a global platform for academic exchange, collaboration, and technological advancement. The journal welcomes high-quality research contributions from any domain and from authors around the globe.
Comprising fundamental research in the life and physical sciences, Research also highlights significant findings and issues in engineering and applied science. The journal proudly features original research articles, reviews, perspectives, and editorials, fostering a diverse and dynamic scholarly environment.