Structural chain of thoughts for radiology education
Akash Awasthi, Brandon Chung, Anh Mai Vu, Saba Khan, Ngan Le, Zhigang Deng, Rishi Agrawal, Carol C. Wu, Hien Van Nguyen
Knowledge-Based Systems, volume 330, Article 114433. Published 2025-09-12. DOI: 10.1016/j.knosys.2025.114433
Impact factor: 7.6 (JCR Q1, Computer Science, Artificial Intelligence)
https://www.sciencedirect.com/science/article/pii/S0950705125014728
Citations: 0
Abstract
Radiology education requires trainees to develop both perceptual and interpretive expertise. However, refinement of these skills is often impeded by the limited availability of mentorship, a consequence of the demanding schedules of experienced radiologists. This lack of personalized guidance makes it difficult for learners to recognize the mistakes they make, understand why those errors occurred, and learn how to refine their perceptual processes. Many of these errors arise from subtle differences in visual attention, such as failing to fixate on an abnormality, allocating insufficient fixation time, or overlooking an abnormality despite scanning the correct region. Although Large Language Models (LLMs) and Large Multimodal Models (LMMs) have been explored for radiology tasks, they often struggle to detect such fine-grained multimodal variations, particularly when comparing gaze behavior between experts and trainees. To address these limitations, we introduce Structural Chain of Thoughts (SCoT), a novel framework that enhances the sensitivity of LLMs and LMMs to nuanced multimodal differences by structuring gaze data and radiology reports into a thought graph. By leveraging a structural prior, SCoT systematically identifies key perceptual and interpretive discrepancies, allowing models to provide targeted, context-aware feedback. This structured approach not only highlights missed findings but also explains the reasoning behind perceptual errors, turning them into learning opportunities. Applied within radiology education, SCoT bridges the gap between expert and novice performance, offering a scalable solution for AI-driven diagnostic training. We further contribute a simulated dataset of perceptual errors in chest X-ray (CXR) interpretation, facilitating future research into multimodal reasoning and AI-driven medical education. Unlike conventional Chain-of-Thought approaches, SCoT explicitly integrates gaze and textual information into a structured reasoning process, yielding interpretable, fine-grained, and personalized feedback tailored to the unique needs of radiology training. The code and data will be available here: GitHub Repository.
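The abstract does not specify how the thought graph is built, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes a hypothetical per-region node that pairs expert and trainee dwell times with whether each report mentions a finding there, and it maps the three error types named above (no fixation, insufficient fixation time, overlooking despite scanning) onto simple rules. The names `RegionNode`, `classify`, `discrepancy_summary`, and the threshold `MIN_DWELL_RATIO` are all assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical threshold (not from the paper): trainee dwell time below this
# fraction of the expert's dwell time counts as "insufficient".
MIN_DWELL_RATIO = 0.5


@dataclass(frozen=True)
class RegionNode:
    """One node of a toy thought graph: an anatomical region of a chest X-ray."""
    region: str
    expert_dwell_s: float     # total expert fixation time on the region (seconds)
    trainee_dwell_s: float    # total trainee fixation time on the region (seconds)
    in_expert_report: bool    # expert report mentions a finding in this region
    in_trainee_report: bool   # trainee report mentions a finding in this region


def classify(node: RegionNode) -> Optional[str]:
    """Return a perceptual-error label for the node, or None if no miss occurred."""
    # No miss if the expert reported nothing here or the trainee also reported it.
    if not node.in_expert_report or node.in_trainee_report:
        return None
    if node.trainee_dwell_s == 0:
        return "no_fixation"
    if node.trainee_dwell_s < MIN_DWELL_RATIO * node.expert_dwell_s:
        return "insufficient_fixation_time"
    # The trainee looked long enough but still did not report the finding.
    return "looked_but_not_reported"


def discrepancy_summary(nodes: List[RegionNode]) -> str:
    """Render the detected discrepancies as structured text for an LLM prompt."""
    lines = []
    for n in nodes:
        label = classify(n)
        if label is not None:
            lines.append(
                f"- {n.region}: {label} "
                f"(trainee {n.trainee_dwell_s:.1f}s vs expert {n.expert_dwell_s:.1f}s)"
            )
    return "\n".join(lines)
```

A structured summary of this kind could then be placed in the prompt of an LLM or LMM so that feedback is grounded in explicit expert-versus-trainee differences rather than in raw gaze coordinates. The rules and threshold here are placeholders; a real system would derive them from the data.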
About the journal:
Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on systems built on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computation techniques, provide balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.