Title: Joint Lesion Detection and Classification of Breast Ultrasound Video via a Clinical Knowledge-Aware Framework
Authors: Minglei Li; Wushuang Gong; Pengfei Yan; Xiang Li; Yuchen Jiang; Hao Luo; Hang Zhou; Shen Yin
DOI: 10.1109/TCSVT.2024.3452497
Journal: IEEE Transactions on Circuits and Systems for Video Technology, vol. 35, no. 1, pp. 45-61 (JCR Q1, Engineering, Electrical & Electronic; Impact Factor 8.3)
Publication date: 2024-08-30 (Journal Article; not open access)
URL: https://ieeexplore.ieee.org/document/10659844/
Citations: 0
Abstract
Ultrasound is an important routine screening modality for breast cancer. Breast ultrasound screening is a dynamic process, and in clinical practice radiologists record representative frames during dynamic scanning for subsequent diagnosis. However, existing computer-assisted diagnosis methods often concentrate on static diagnostic results obtained by analyzing only these representative frames, ignoring the valuable information in the dynamic examination process that facilitates diagnosis. Moreover, breast lesions can exhibit varying characteristics during scanning, so learning effective lesion representations is challenging and may affect the clinical interpretability of such methods. To this end, we draw insights from the behavior of radiologists during dynamic breast examination and leverage knowledge of breast anatomy to propose a clinical knowledge-aware framework for lesion detection and classification in breast ultrasound videos. The framework is equipped with global-local attentive aggregation and a dynamic allocation mechanism that simulates how radiologists search for diagnostic clues, thereby integrating local localization and global semantic information from the video into the lesion's feature representation. An anatomically-aware transformer is also designed to refine the lesion feature representation using spatial relationships within and across the anatomical layers of the breast. Extensive experiments show that the proposed framework achieves competitive performance in both lesion detection and video classification while exhibiting good clinical applicability and interpretability, with an average precision of 40.80% and an AUC of 85.86% on our constructed breast video dataset, and an average precision of 39.79% and an AUC of 87.04% on a publicly available dataset.
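The global-local attentive aggregation described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration only: the function names, feature dimensions, dot-product attention, and the simple averaging fusion are not taken from the paper, whose actual architecture may differ substantially.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

def global_local_aggregate(frame_feats, query):
    """Hypothetical sketch: attend over per-frame (local) features with a
    lesion query, then fuse with a global video-level context feature."""
    # frame_feats: (T, D) per-frame local features; query: (D,) lesion query.
    scores = frame_feats @ query / np.sqrt(frame_feats.shape[1])  # (T,) scaled dot-product scores
    weights = softmax(scores)                                     # attention over frames
    local = weights @ frame_feats                                 # attended local summary
    global_ctx = frame_feats.mean(axis=0)                         # global semantic context
    return 0.5 * (local + global_ctx)                             # fused lesion representation

rng = np.random.default_rng(0)
feats = rng.standard_normal((8, 16))   # 8 frames, 16-dim features
q = rng.standard_normal(16)
fused = global_local_aggregate(feats, q)
print(fused.shape)  # (16,)
```

The attention weights play the role the abstract attributes to the radiologist-like search for diagnostic clues (emphasizing informative frames), while the mean-pooled term stands in for the global video semantics; the paper's dynamic allocation mechanism and anatomically-aware transformer are not modeled here.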
Journal introduction:
The IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) is dedicated to covering all aspects of video technologies from a circuits and systems perspective. We encourage submissions of general, theoretical, and application-oriented papers related to image and video acquisition, representation, presentation, and display. Additionally, we welcome contributions in areas such as processing, filtering, and transforms; analysis and synthesis; learning and understanding; compression, transmission, communication, and networking; as well as storage, retrieval, indexing, and search. Furthermore, papers focusing on hardware and software design and implementation are highly valued. Join us in advancing the field of video technology through innovative research and insights.