YOLOv8 Outperforms Traditional CNN Models in Mammography Classification: Insights From a Multi-Institutional Dataset

Impact factor: 3 | Zone 4 (Computer Science) | JCR Q2: ENGINEERING, ELECTRICAL & ELECTRONIC

International Journal of Imaging Systems and Technology | Pub Date: 2024-12-16 | DOI: 10.1002/ima.70008

Erfan AkbarnezhadSany, Hossein EntezariZarch, Mohammad AlipoorKermani, Baharak Shahin, Mohsen Cheki, Aida Karami, Samaneh Zahedi, Zahra AhmadPour, Sadegh Ahmadi-Mazhin, Ali Rahimnezhad, Sahar Sayfollahi, Salar Bijari, Melika Shojaee, Seyed Masoud Rezaeijo

Volume 35, Issue 1 | Full text: https://onlinelibrary.wiley.com/doi/10.1002/ima.70008

Citations: 0

Abstract

This study evaluates four deep learning models (YOLOv8, VGG16, ResNet101, and EfficientNet) for classifying mammography images as normal, benign, or malignant, using a large-scale, multi-institutional dataset. Each dataset was divided into training and testing groups with an 80%/20% split, ensuring that all examinations from the same patient were allocated to the same split. The training set contained 10,220 malignant, 6,086 benign, and 8,526 normal images; the testing set contained 1,441 malignant, 1,124 benign, and 1,881 normal images. All models were fine-tuned using transfer learning, with input images standardized to 224 × 224 pixels and data augmentation applied to improve robustness. Among the models, YOLOv8 achieved the highest performance, with an AUC of 93.33% on the training dataset and 91% on the testing dataset. It also showed superior accuracy (91.82% training, 86.68% testing), F1-score (91.11% training, 84.86% testing), and specificity (95.80% training, 93.32% testing). ResNet101, VGG16, and EfficientNet also performed well, with ResNet101 achieving an AUC of 91.67% (training) and 90.00% (testing). Grad-CAM visualizations were used to identify the regions most influential in model decision-making. This multi-model evaluation highlights YOLOv8's potential for accurately classifying mammograms, while demonstrating that all models contribute valuable insights for improving breast cancer detection. Future clinical trials will focus on refining these models to assist healthcare professionals in delivering accurate and timely diagnoses.
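The patient-level split and the reported metrics can be illustrated with a minimal sketch. This is not the authors' code. The record layout `(patient_id, image_id)`, the helper names `patient_level_split` and `one_vs_rest_metrics`, the random seed, and the choice of macro averaging over the three classes are all assumptions made for illustration; the abstract does not say how the multi-class F1-score and specificity were averaged.

```python
import random
from collections import defaultdict


def patient_level_split(records, test_fraction=0.2, seed=0):
    """Split (patient_id, image_id) records so no patient spans both splits.

    Hypothetical helper: record layout and seed are illustrative assumptions.
    """
    by_patient = defaultdict(list)
    for patient_id, image_id in records:
        by_patient[patient_id].append(image_id)

    patients = sorted(by_patient)  # sort first so the shuffle is reproducible
    random.Random(seed).shuffle(patients)

    # The fraction is applied to patients, so image counts are only roughly 80/20.
    n_test = max(1, round(len(patients) * test_fraction))
    test_patients = set(patients[:n_test])

    train = [(p, i) for p in patients if p not in test_patients for i in by_patient[p]]
    test = [(p, i) for p in patients if p in test_patients for i in by_patient[p]]
    return train, test


def one_vs_rest_metrics(y_true, y_pred, classes):
    """Accuracy plus macro-averaged F1 and specificity from hard labels.

    Macro averaging is an assumption; the abstract does not specify it.
    """
    f1_scores, specificities = [], []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        tn = len(pairs) - tp - fp - fn
        f1_scores.append(2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0)
        specificities.append(tn / (tn + fp) if (tn + fp) else 0.0)

    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return {
        "accuracy": accuracy,
        "macro_f1": sum(f1_scores) / len(f1_scores),
        "macro_specificity": sum(specificities) / len(specificities),
    }


if __name__ == "__main__":
    # Toy data: 10 patients with 3 images each.
    demo = [(f"p{k}", f"img{k}_{j}") for k in range(10) for j in range(3)]
    train, test = patient_level_split(demo)
    print(len(train), len(test))
    print(one_vs_rest_metrics(
        ["normal", "normal", "benign", "malignant"],
        ["normal", "benign", "benign", "malignant"],
        ["normal", "benign", "malignant"],
    ))
```

Because whole patients are assigned to one split, the image-level ratio only approximates the nominal 80%/20%. The counts in the abstract total 24,832 training and 4,446 testing images; the abstract does not state whether the ratio was applied per patient, per examination, or per image.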

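Source journal: International Journal of Imaging Systems and Technology (Engineering & Technology: Imaging Science & Photographic Technology)
CiteScore: 6.90
Self-citation rate: 6.10%
Articles published: 138
Review time: 3 months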

About the journal: The International Journal of Imaging Systems and Technology (IMA) is a forum for the exchange of ideas and results relevant to imaging systems, including imaging physics and informatics. The journal covers all imaging modalities in humans and animals. IMA accepts technically sound and scientifically rigorous research in the interdisciplinary field of imaging, including relevant algorithmic research and hardware and software development, and their applications relevant to medical research. The journal provides a platform to publish original research in structural and functional imaging. The journal is also open to imaging studies of the human body and of animals that describe novel diagnostic imaging and analysis methods. Technical, theoretical, and clinical research in both normal and clinical populations is encouraged. Submissions describing methods, software, databases, and replication studies, as well as negative results, are also considered. The scope of the journal includes, but is not limited to, the following in the context of biomedical research: Imaging and neuro-imaging modalities: structural MRI, functional MRI, PET, SPECT, CT, ultrasound, EEG, MEG, NIRS, etc.; Neuromodulation and brain stimulation techniques such as TMS and tDCS; Software and hardware for imaging, especially related to human and animal health; Image segmentation in normal and clinical populations; Pattern analysis and classification using machine learning techniques; Computational modeling and analysis; Brain connectivity and connectomics; Systems-level characterization of brain function; Neural networks and neurorobotics; Computer vision, based on human/animal physiology; Brain-computer interface (BCI) technology; Big data, databasing and data mining.