Victoria Wu,Andrea Fung,Bahar Khodabakhshian,Baraa Abdelsamad,Hooman Vaseli,Neda Ahmadi,Jamie A D Goco,Michael Y Tsang,Christina Luong,Purang Abolmaesumi,Teresa S M Tsang
Journal: IEEE Transactions on Medical Imaging (Impact Factor 9.8; JCR Q1, Computer Science, Interdisciplinary Applications)
Published: 2025-09-12
DOI: 10.1109/tmi.2025.3609319 (https://doi.org/10.1109/tmi.2025.3609319)
Source: Semantic Scholar
MultiASNet: Multimodal Label Noise Robust Framework for the Classification of Aortic Stenosis in Echocardiography.
Aortic stenosis (AS), a prevalent and serious heart valve disorder, requires early detection but remains difficult to diagnose in routine practice. Although echocardiography with Doppler imaging is the clinical standard, these assessments are typically limited to trained specialists. Point-of-care ultrasound (POCUS) offers an accessible alternative for AS screening but is restricted to basic 2D B-mode imaging, often lacking the analysis Doppler provides. Our project introduces MultiASNet, a multimodal machine learning framework designed to enhance AS screening with POCUS by combining 2D B-mode videos with structured data from echocardiography reports, including Doppler parameters. Using contrastive learning, MultiASNet aligns video features with report features in tabular form from the same patient to improve interpretive quality. To address misalignment where a single report corresponds to multiple video views, some irrelevant to AS diagnosis, we use cross-attention in a transformer-based video and tabular network to assign less importance to irrelevant report data. The model integrates structured data only during training, enabling independent use with B-mode videos during inference for broader accessibility. MultiASNet also incorporates sample selection to counteract label noise from observer variability, yielding improved accuracy on two datasets. We achieved balanced accuracy scores of 93.0% on a private dataset and 83.9% on the public TMED-2 dataset for AS detection. For severity classification, balanced accuracy scores were 80.4% and 59.4% on the private and public datasets, respectively. This model facilitates reliable AS screening in non-specialist settings, bridging the gap left by Doppler data while reducing noise-related errors. Our code is publicly available at github.com/DeepRCL/MultiASNet.
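The contrastive alignment step mentioned in the abstract can be illustrated with a small sketch. It assumes a symmetric InfoNCE (CLIP-style) objective over paired video and report embeddings. The abstract does not state which loss is used, so the function names `l2_normalize` and `info_nce`, the temperature value, and the toy embeddings are hypothetical and are not taken from the MultiASNet code.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def _diagonal_cross_entropy(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for a stable log-sum-exp
        log_sum_exp = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def info_nce(video_embs, report_embs, temperature=0.1):
    """Symmetric InfoNCE loss for a batch of paired embeddings.

    video_embs[i] and report_embs[i] are assumed to come from the same patient.
    The temperature of 0.1 is an illustrative choice, not a value from the paper.
    """
    videos = [l2_normalize(v) for v in video_embs]
    reports = [l2_normalize(r) for r in report_embs]
    # logits[i][j] = cosine similarity of video i and report j, scaled by temperature
    logits = [
        [sum(a * b for a, b in zip(v, r)) / temperature for r in reports]
        for v in videos
    ]
    transposed = [list(col) for col in zip(*logits)]
    # Average the video-to-report and report-to-video directions
    return 0.5 * (_diagonal_cross_entropy(logits) + _diagonal_cross_entropy(transposed))


if __name__ == "__main__":
    videos = [[1.0, 0.0], [0.0, 1.0]]
    matched_reports = [[1.0, 0.0], [0.0, 1.0]]
    swapped_reports = [[0.0, 1.0], [1.0, 0.0]]
    print("matched:", info_nce(videos, matched_reports))
    print("swapped:", info_nce(videos, swapped_reports))
```

In this toy batch, the matched pairs give a loss close to zero, while swapping the reports gives a much larger loss. Minimizing such a loss is what pulls a patient's video and report features together during training. Because the report branch only appears in the loss, the video encoder can be used on its own at inference time, consistent with the abstract.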
About the journal:
The IEEE Transactions on Medical Imaging (T-MI) is a journal that welcomes the submission of manuscripts focusing on various aspects of medical imaging. The journal encourages the exploration of body structure, morphology, and function through different imaging techniques, including ultrasound, X-rays, magnetic resonance, radionuclides, microwaves, and optical methods. It also promotes contributions related to cell and molecular imaging, as well as all forms of microscopy.
T-MI publishes original research papers that cover a wide range of topics, including but not limited to novel acquisition techniques, medical image processing and analysis, visualization and performance, pattern recognition, machine learning, and other related methods. The journal particularly encourages highly technical studies that offer new perspectives. By emphasizing the unification of medicine, biology, and imaging, T-MI seeks to bridge the gap between instrumentation, hardware, software, mathematics, physics, biology, and medicine by introducing new analysis methods.
The journal welcomes strong application papers that describe novel methods. Papers that focus solely on important applications of medically adopted or well-established methods, without significant methodological innovation, are directed to other journals. T-MI is indexed in PubMed® and MEDLINE®, which are products of the United States National Library of Medicine.