Mengying Li , Yin Fang , Jiong Shao , Yan Jiang , Guoping Xu , Xin-wu Cui , Xinglong Wu
Vision transformer-based multimodal fusion network for classification of tumor malignancy on breast ultrasound: A retrospective multicenter study
International Journal of Medical Informatics, vol. 196, Article 105793. Journal Article, published 2025-01-21.
DOI: 10.1016/j.ijmedinf.2025.105793
URL: https://www.sciencedirect.com/science/article/pii/S1386505625000103
Impact factor: 3.7 (JCR Q2, Computer Science, Information Systems)
Citations: 0
Abstract
Background
In routine breast cancer diagnosis, accurately distinguishing benign from malignant breast masses is critical. Few prior studies have jointly integrated imaging histology (radiomics) features, deep learning features, and clinical parameters. The primary objective of this retrospective study was to develop a multimodal feature fusion model for predicting breast tumor malignancy from ultrasound images.
Method
We compiled a dataset that included clinical features from 1065 patients and 3315 image datasets. Specifically, we selected data from 603 patients for training our multimodal model. The comprehensive experimental workflow involves identifying the optimal unimodal model, extracting unimodal features, fusing multimodal features, gaining insights from these fused features, and ultimately generating prediction results using a classifier.
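The feature-fusion step of this workflow can be illustrated with a minimal sketch. The abstract does not say how the features are fused, so plain concatenation after per-column standardization is an assumption here, and the function names (`zscore`, `standardize`, `fuse`) and toy values are hypothetical rather than taken from the study.

```python
from statistics import mean, pstdev


def zscore(column):
    """Standardize one feature column to zero mean and unit variance."""
    mu, sigma = mean(column), pstdev(column)
    # A constant column carries no information; map it to zeros.
    return [0.0 if sigma == 0 else (x - mu) / sigma for x in column]


def standardize(matrix):
    """Standardize every column of a patients-by-features matrix."""
    columns = [zscore(list(col)) for col in zip(*matrix)]
    return [list(row) for row in zip(*columns)]


def fuse(*modalities):
    """Concatenate per-patient feature vectors from several modalities.

    Assumption: simple concatenation; the study's actual fusion may differ.
    """
    n_patients = len(modalities[0])
    if any(len(m) != n_patients for m in modalities):
        raise ValueError("every modality needs one row per patient")
    scaled = [standardize(m) for m in modalities]
    return [sum((m[i] for m in scaled), []) for i in range(n_patients)]


# Hypothetical toy values for three patients (not data from the study).
imaging = [[0.2, 1.5], [0.4, 1.1], [0.9, 0.3]]               # hand-crafted image features
deep = [[0.1, 0.7, 0.3], [0.5, 0.2, 0.8], [0.9, 0.4, 0.6]]  # network embeddings
clinical = [[45.0], [52.0], [61.0]]                          # e.g. age in years

fused = fuse(imaging, deep, clinical)
print(len(fused), len(fused[0]))  # 3 patients, 2 + 3 + 1 = 6 features each
```

The fused rows would then be passed to whatever classifier is chosen. Standardizing before concatenation keeps a modality with large raw values, such as age, from dominating the others.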
Results
Our multimodal feature fusion model demonstrates outstanding performance, achieving an AUC of 0.994 (95% CI: 0.988–0.999) and an F1 score of 0.971 on the primary multicenter dataset. In the evaluation on two independent testing cohorts (TCs), it maintains strong performance, with AUCs of 0.942 (95% CI: 0.854–0.994) for TC1 and 0.945 (95% CI: 0.857–1.000) for TC2, accompanied by corresponding F1 scores of 0.872 and 0.857, respectively. Notably, the decision curve analysis reveals that our model achieves higher accuracy within the threshold probability range of approximately [0.210, 0.890] (TC1) and [0.000, 0.850] (TC2) compared to alternative methods. This capability enhances its utility in clinical decision-making, providing substantial benefits.
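For readers unfamiliar with the metrics reported above, the following sketch computes AUC, F1, and the net benefit used in decision curve analysis from their standard definitions. The abstract does not describe the authors' evaluation code, so this is only an illustration. The labels and scores are hypothetical toy values, and the 0.5 default threshold is an assumption. Decision curve analysis is conventionally reported as net benefit at a threshold probability p_t, defined as TP/n − (FP/n) · p_t/(1 − p_t).

```python
def auc(labels, scores):
    """AUC as the probability that a malignant case outranks a benign one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Ties count as half a win (Mann-Whitney U formulation).
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion(labels, scores, threshold):
    """Return (tp, fp, fn, tn) when scores >= threshold are called malignant."""
    tp = fp = fn = tn = 0
    for y, s in zip(labels, scores):
        predicted = s >= threshold
        if predicted and y == 1:
            tp += 1
        elif predicted and y == 0:
            fp += 1
        elif y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def f1_score(labels, scores, threshold=0.5):
    """F1 = 2TP / (2TP + FP + FN); 0.0 when there are no positives at all."""
    tp, fp, fn, _ = confusion(labels, scores, threshold)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def net_benefit(labels, scores, threshold):
    """Decision-curve net benefit at threshold probability 0 <= threshold < 1."""
    tp, fp, _, _ = confusion(labels, scores, threshold)
    n = len(labels)
    return tp / n - (fp / n) * threshold / (1 - threshold)


# Hypothetical toy cohort: 1 = malignant, 0 = benign (not data from the study).
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

print(auc(labels, scores))               # 15 of 16 pairs ranked correctly
print(f1_score(labels, scores))          # at the assumed 0.5 threshold
print(net_benefit(labels, scores, 0.5))
```

A decision curve is obtained by evaluating `net_benefit` over a grid of thresholds and comparing the result with the treat-all and treat-none strategies.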
Conclusion
The multimodal model proposed in this paper comprehensively evaluates patients’ multifaceted clinical information, predicts whether breast tumors on ultrasound are benign or malignant, and achieves high performance metrics.
About the journal:
International Journal of Medical Informatics provides an international medium for dissemination of original results and interpretative reviews concerning the field of medical informatics. The Journal emphasizes the evaluation of systems in healthcare settings.
The scope of the journal covers:
Information systems, including national or international registration systems, hospital information systems, departmental and/or physician's office systems, document handling systems, electronic medical record systems, standardization, systems integration, etc.;
Computer-aided medical decision support systems using heuristic, algorithmic and/or statistical methods as exemplified in decision theory, protocol development, artificial intelligence, etc.;
Educational computer-based programs pertaining to medical informatics or medicine in general;
Organizational, economic, social, clinical impact, ethical and cost-benefit aspects of IT applications in health care.