Deep learning for tubes and lines detection in critical illness: Generalizability and comparison with residents
Pootipong Wongveerasin, Trongtum Tongdee, Pairash Saiviroonporn
European Journal of Radiology Open, published 2024-07-29. DOI: 10.1016/j.ejro.2024.100593
Open-access PDF: https://www.sciencedirect.com/science/article/pii/S2352047724000480
Citations: 0
Abstract
Background
Artificial intelligence (AI) has proven useful for assessing tubes and lines on chest radiographs of general patients. However, validation in intensive care unit (ICU) patients remains imperative.
Methods
This retrospective case-control study evaluated the performance of deep learning (DL) models for tube and line classification on both an external public dataset and a local dataset comprising 303 films randomly sampled from the ICU database. Endotracheal tubes (ETTs), central venous catheters (CVCs), and nasogastric tubes (NGTs) were classified into "Normal," "Abnormal," or "Borderline" positions by DL models with and without rule-based modification. Performance was evaluated against an experienced radiologist's reads as the reference standard.
Results
The algorithm showed decreased performance on the local ICU dataset compared to the external dataset, with the area under the receiver operating characteristic curve (AUC) falling from 0.967 (95 % CI 0.965–0.973) to 0.70 (95 % CI 0.68–0.77). Significant improvement in the ETT classification task was observed after the model was modified to use the spatial relationship between line tips and reference anatomy, with the AUC increasing from 0.71 (95 % CI 0.70–0.75) to 0.86 (95 % CI 0.83–0.94).
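The rule-based modification described above relies on the spatial relationship between a detected tube tip and reference anatomy. A minimal sketch of that idea for an ETT, using tip-to-carina distance: the function name, pixel-spacing convention, and distance thresholds below are illustrative assumptions (a 3–7 cm tip-above-carina range is a common placement convention), not the authors' implementation.

```python
def classify_ett(tip_y: float, carina_y: float, px_per_cm: float) -> str:
    """Classify an ETT tip position from its vertical distance above the carina.

    tip_y and carina_y are image-row coordinates (larger = lower on the film);
    px_per_cm converts pixels to centimeters. All thresholds are illustrative.
    """
    # Distance of the ETT tip above the carina in cm (positive = above).
    dist_cm = (carina_y - tip_y) / px_per_cm

    # Illustrative thresholds: roughly 3-7 cm above the carina is often
    # considered correct placement; a narrow band on either side is borderline.
    if 3.0 <= dist_cm <= 7.0:
        return "Normal"
    if 2.0 <= dist_cm < 3.0 or 7.0 < dist_cm <= 8.0:
        return "Borderline"
    return "Abnormal"
```

In a full pipeline, the DL model would supply the tip and carina coordinates, and a rule like this would override or refine the network's direct three-class prediction.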
Conclusions
The externally trained model exhibited limited generalizability on the local ICU dataset. Therefore, evaluating the performance of externally trained AI before integrating it into the critical care routine is crucial. A rule-based algorithm may be used in combination with DL to improve results.