Title: External Validation of an Upgraded AI Model for Screening Ileocolic Intussusception Using Pediatric Abdominal Radiographs: Multicenter Retrospective Study
Authors: Jeong Hoon Lee, Pyeong Hwa Kim, Nak-Hoon Son, Kyunghwa Han, Yeseul Kang, Sejin Jeong, Eun-Kyung Kim, Haesung Yoon, Sergios Gatidis, Shreyas Vasanawala, Hee Mang Yoon, Hyun Joo Shin
Journal: Journal of Medical Internet Research, volume 27, article e72097
Publication date: 2025-07-08
Publication type: Journal Article
DOI: 10.2196/72097 (https://doi.org/10.2196/72097)
Impact factor: 5.8 (JCR Q1, Health Care Sciences & Services)
Citations: 0
Abstract
Background: Artificial intelligence (AI) is increasingly used in radiology, but its development in pediatric imaging remains limited, particularly for emergent conditions. Ileocolic intussusception is an important cause of acute abdominal pain in infants and toddlers and requires timely diagnosis to prevent complications such as bowel ischemia or perforation. While ultrasonography is the diagnostic standard due to its high sensitivity and specificity, its accessibility may be limited, especially outside tertiary centers. Abdominal radiographs (AXRs), despite their limited sensitivity, are often the first-line imaging modality in clinical practice. In this context, AI could support early screening and triage by analyzing AXRs and identifying patients who require further ultrasonography evaluation.
Objective: This study aimed to upgrade and externally validate an AI model for screening ileocolic intussusception using pediatric AXRs with multicenter data and to assess the diagnostic performance of the model in comparison with radiologists of varying experience levels with and without AI assistance.
Methods: This retrospective study included pediatric patients (≤5 years) who underwent both AXRs and ultrasonography for suspected intussusception. Based on the preliminary study from hospital A, the AI model was retrained using data from hospital B and validated with external datasets from hospitals C and D. Diagnostic performance of the upgraded AI model was evaluated using sensitivity, specificity, and the area under the receiver operating characteristic curve (AUC). A reader study was conducted with 3 radiologists, including 2 trainees and 1 pediatric radiologist, to evaluate diagnostic performance with and without AI assistance.
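The three metrics named in the Methods can be illustrated with a minimal sketch. It assumes binary labels (1 = intussusception confirmed on ultrasonography) and a model score per patient. The function names, the toy data, and the 0.5 decision threshold are all hypothetical and are not taken from the study.

```python
from typing import Sequence


def sensitivity_specificity(labels: Sequence[int], preds: Sequence[int]) -> tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels and predictions."""
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    tn = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outscores a negative one (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, NOT taken from the study.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.1]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]  # 0.5 threshold is an assumption

sens, spec = sensitivity_specificity(toy_labels, toy_preds)
toy_auc = auc(toy_labels, toy_scores)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} AUC={toy_auc:.3f}")
```

The pairwise AUC is quadratic in the number of patients, which is fine for cohorts of this size. A rank-based implementation would be preferable for large datasets.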
Results: The previously developed AI model had been trained on 746 patients from hospital A; an additional 431 patients from hospital B (including 143 intussusception cases) were used for further training to develop the upgraded AI model. External validation was conducted using data from hospital C (n=68; 19 intussusception cases) and hospital D (n=90; 30 intussusception cases). In the external validation set, the upgraded AI model achieved a sensitivity of 81.7% (95% CI 68.6%-90%), a specificity of 81.7% (95% CI 73.3%-87.8%), and an AUC of 86.2% (95% CI 79.2%-92.1%). Without AI assistance, radiologists showed lower performance (overall AUC 64%; sensitivity 49.7%; specificity 77.1%). With AI assistance, radiologists' specificity improved to 93% (difference +15.9%; P<.001), and AUC increased to 79.2% (difference +15.2%; P=.05). The least experienced reader showed the largest improvement in specificity (+37.6%; P<.001) and AUC (+14.7%; P=.08).
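The arithmetic behind the reported cohort sizes and reader-study differences can be checked with a short sketch. All numbers are copied from the Results paragraph. Pooling hospitals C and D into a single external validation set is an assumption based on the abstract's wording, and the variable names are illustrative.

```python
# Counts and percentages below are copied from the Results paragraph.
hospital_c = {"patients": 68, "cases": 19}
hospital_d = {"patients": 90, "cases": 30}

# Assumption: the external validation set pools hospitals C and D.
pooled_patients = hospital_c["patients"] + hospital_d["patients"]
pooled_cases = hospital_c["cases"] + hospital_d["cases"]
pooled_non_cases = pooled_patients - pooled_cases
case_fraction = pooled_cases / pooled_patients

# Reader-study changes with AI assistance, in percentage points.
specificity_gain = round(93.0 - 77.1, 1)
auc_gain = round(79.2 - 64.0, 1)

print(f"pooled: {pooled_patients} patients, {pooled_cases} cases, "
      f"{pooled_non_cases} non-cases ({case_fraction:.1%} cases)")
print(f"specificity gain: +{specificity_gain} points, AUC gain: +{auc_gain} points")
```

The reported differences are absolute changes in percentage points (for example, 77.1% to 93%), not relative improvements.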
Conclusions: The upgraded AI model improved diagnostic performance for screening ileocolic intussusception on pediatric AXRs. It effectively enhanced the specificity and overall accuracy of radiologists, particularly those with less experience in pediatric radiology. A user-friendly software platform was introduced to support broader clinical validation, underscoring the potential of AI as a screening and triage tool in pediatric emergency settings.
About the journal:
The Journal of Medical Internet Research (JMIR) is a highly respected publication in the field of health informatics and health services. Founded in 1999, JMIR has been a pioneer in the field for over two decades.
As a leader in the industry, the journal focuses on digital health, data science, health informatics, and emerging technologies for health, medicine, and biomedical research. It is recognized as a top publication in these disciplines, ranking in the first quartile (Q1) by Impact Factor.
Notably, JMIR holds the prestigious position of being ranked #1 on Google Scholar within the "Medical Informatics" discipline.