Self-prompt contextual learning with AxialMamba for multi-label segmentation in carotid ultrasound

Congyu Tian, Yan Hu, Meng Zhang, Xiangyun Liao, Jianping Lv, Weixin Si

Expert Systems with Applications, Volume 274, Article 126749. Published 2025-02-21. DOI: 10.1016/j.eswa.2025.126749
Citations: 0
Abstract
Plaque and vessel segmentation in carotid ultrasound videos is critical for assessing carotid artery stenosis and providing essential information for doctors’ diagnostic and treatment planning. However, most existing methods segment vessels and plaques without distinguishing the plaque types and their corresponding vascular segments. To address this limitation, we define a novel multi-label carotid ultrasound video segmentation task that categorizes vessels based on their anatomical locations and classifies plaques according to their echo characteristics. To address this task, we constructed a novel dataset, CAUS45, comprising 7479 annotated frames from 45 patients. In this dataset, vessels are segmented into three categories: the internal carotid artery (ICA), external carotid artery (ECA), and common carotid artery (CCA). Plaques are classified based on echogenicity into three types: weakly echogenic, moderately echogenic, and strongly echogenic. To further advance this task, we propose a self-prompt contextual segmentation framework, termed SPCNet. To address the challenges posed by the significant variability in ultrasound images, we leverage foundation models pretrained on large-scale ultrasound datasets as part of our video clip encoder to extract features from individual frames. To effectively utilize the inter-frame contextual information within a clip, we propose a novel AxialMamba module designed for extracting inter-frame features. Additionally, to fully exploit the correlation between different clips within a video, we introduce a self-prompted contextual learning strategy to establish contextual dependencies across clips. Experiments demonstrate that SPCNet achieves a Dice coefficient of 89.08%, a 3.04% improvement over the current state-of-the-art method. Additionally, SPCNet achieves a Hausdorff Distance (HD) of 5.04 and an Average Surface Distance (ASD) of 1.21 on our private CAUS45 dataset.
Our method shows great potential for application in practical large-scale screening.
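The abstract describes a clip-wise pipeline: the video is split into clips, each clip is encoded, and a "self-prompt" carries contextual dependencies from one clip to the next. The sketch below is only a schematic illustration of that control flow under our own assumptions; `encode` and `decode` are hypothetical placeholders, not the paper's actual SPCNet modules or AxialMamba block.

```python
import numpy as np

def segment_video(frames, clip_len, encode, decode):
    """Schematic clip-wise segmentation loop (illustrative only).

    frames: list of per-frame arrays.
    encode: maps a clip (list of frames) to clip features.
    decode: maps (features, prompt) to (per-frame masks, updated prompt).
    """
    # Split the video into non-overlapping clips of length clip_len.
    clips = [frames[i:i + clip_len] for i in range(0, len(frames), clip_len)]
    prompt = None  # cross-clip context ("self-prompt"), carried forward
    masks = []
    for clip in clips:
        feats = encode(clip)                        # per-clip features
        clip_masks, prompt = decode(feats, prompt)  # use and update the prompt
        masks.extend(clip_masks)
    return masks
```

The key design point the abstract implies is that the prompt threads state across clips, so later clips are segmented with knowledge of earlier ones rather than independently.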
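The reported metrics are the Dice coefficient, Hausdorff Distance (HD), and Average Surface Distance (ASD). As a rough sketch (not the paper's evaluation code), Dice and a symmetric Hausdorff distance between binary masks can be computed as follows; ASD is analogous but averages, rather than maximizes, the surface-to-surface distances.

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff

def dice(pred, gt):
    """Dice coefficient between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    total = pred.sum() + gt.sum()
    return 2.0 * inter / total if total else 1.0

def hausdorff(pred, gt):
    """Symmetric Hausdorff distance between the foreground point sets."""
    p = np.argwhere(pred)  # coordinates of foreground pixels
    g = np.argwhere(gt)
    return max(directed_hausdorff(p, g)[0], directed_hausdorff(g, p)[0])
```

For example, two identical masks give Dice 1.0 and HD 0.0, while shifting a 2x2 square mask one pixel sideways gives Dice 0.5 and HD 1.0.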
About the journal:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.