YOLOv8 Outperforms Traditional CNN Models in Mammography Classification: Insights From a Multi-Institutional Dataset
Authors: Erfan AkbarnezhadSany, Hossein EntezariZarch, Mohammad AlipoorKermani, Baharak Shahin, Mohsen Cheki, Aida Karami, Samaneh Zahedi, Zahra AhmadPour, Sadegh Ahmadi-Mazhin, Ali Rahimnezhad, Sahar Sayfollahi, Salar Bijari, Melika Shojaee, Seyed Masoud Rezaeijo
Journal: International Journal of Imaging Systems and Technology, 35(1), published 2024-12-16
DOI: 10.1002/ima.70008
URL: https://onlinelibrary.wiley.com/doi/10.1002/ima.70008
Citations: 0
Abstract
This study evaluates four deep learning models (YOLOv8, VGG16, ResNet101, and EfficientNet) for classifying mammography images as normal, benign, or malignant using a large-scale, multi-institutional dataset. Each dataset was divided into training and testing groups with an 80%/20% split, with all examinations from the same patient allocated to the same split. The training set contained 10,220 malignant, 6,086 benign, and 8,526 normal images; the testing set contained 1,441 malignant, 1,124 benign, and 1,881 normal images. All models were fine-tuned using transfer learning, with input images standardized to 224 × 224 pixels and data augmentation applied to improve robustness. YOLOv8 performed best, achieving an AUC of 93.33% on the training dataset and 91% on the testing dataset. It also showed superior accuracy (91.82% training, 86.68% testing), F1-score (91.11% training, 84.86% testing), and specificity (95.80% training, 93.32% testing). ResNet101, VGG16, and EfficientNet also performed well, with ResNet101 achieving an AUC of 91.67% (training) and 90.00% (testing). Grad-CAM visualizations were used to identify the regions most influential in model decision-making. This multi-model evaluation highlights YOLOv8's potential for accurately classifying mammograms, while showing that all four models offer useful insights for improving breast cancer detection. Future clinical trials will focus on refining these models to assist healthcare professionals in delivering accurate and timely diagnoses.
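The abstract describes two steps that are easy to get subtly wrong: keeping every examination from one patient in a single split, and reporting specificity for a three-class problem. The sketch below illustrates both under stated assumptions and is not the authors' code. The record layout `(patient_id, image_path, label)`, the function names, and the choice of macro-averaged one-vs-rest specificity are all hypothetical; the abstract does not say how specificity was averaged across classes.

```python
import random
from collections import defaultdict


def split_by_patient(records, test_fraction=0.2, seed=0):
    """Split records so that no patient appears in both train and test.

    `records` is a list of (patient_id, image_path, label) tuples; this
    layout is an assumption made for illustration.
    """
    by_patient = defaultdict(list)
    for record in records:
        by_patient[record[0]].append(record)

    # Shuffle patients, not images, so a patient's images stay together.
    patient_ids = sorted(by_patient)
    random.Random(seed).shuffle(patient_ids)

    n_test = max(1, round(len(patient_ids) * test_fraction))
    test_ids = patient_ids[:n_test]
    train_ids = patient_ids[n_test:]

    train = [r for pid in train_ids for r in by_patient[pid]]
    test = [r for pid in test_ids for r in by_patient[pid]]
    return train, test


def macro_specificity(y_true, y_pred, classes):
    """Average one-vs-rest specificity, TN / (TN + FP), over the classes."""
    scores = []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tn = sum(1 for t, p in pairs if t != c and p != c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        if tn + fp:  # skip a class that has no negative examples
            scores.append(tn / (tn + fp))
    if not scores:
        raise ValueError("no class has negative examples")
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy data: 10 hypothetical patients with 3 images each.
    demo = [(f"p{i}", f"img_{i}_{j}.png", "normal")
            for i in range(10) for j in range(3)]
    train, test = split_by_patient(demo)
    print(len(train), len(test))
```

Because the split is drawn over patients, the image-level ratio can drift from 80/20 when patients contribute different numbers of images.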
About the journal:
The International Journal of Imaging Systems and Technology (IMA) is a forum for the exchange of ideas and results relevant to imaging systems, including imaging physics and informatics. The journal covers all imaging modalities in humans and animals.
IMA accepts technically sound and scientifically rigorous research in the interdisciplinary field of imaging, including relevant algorithmic research and hardware and software development, and their applications relevant to medical research. The journal provides a platform to publish original research in structural and functional imaging.
The journal is also open to imaging studies of the human body and of animals that describe novel diagnostic imaging and analysis methods. Technical, theoretical, and clinical research in both normal and clinical populations is encouraged. Submissions describing methods, software, databases, and replication studies, as well as negative results, are also considered.
The scope of the journal includes, but is not limited to, the following in the context of biomedical research:
Imaging and neuro-imaging modalities: structural MRI, functional MRI, PET, SPECT, CT, ultrasound, EEG, MEG, NIRS etc.;
Neuromodulation and brain stimulation techniques such as TMS and tDCS;
Software and hardware for imaging, especially related to human and animal health;
Image segmentation in normal and clinical populations;
Pattern analysis and classification using machine learning techniques;
Computational modeling and analysis;
Brain connectivity and connectomics;
Systems-level characterization of brain function;
Neural networks and neurorobotics;
Computer vision, based on human/animal physiology;
Brain-computer interface (BCI) technology;
Big data, databasing and data mining.